Judging Event Covariations

limitations. In addition, subjects will be asked to identify the best
rule among our set of proposed strategies. 

Subjects for this experiment will be male and female college students,
since our past research suggests that this age group should provide
substantial numbers of a versus b, sum of diagonals, and conditional
probability judges. Sex of subject will be considered as a factor in the
design in light of common findings of sex differences in math skills among
adolescents and adults (e.g., Maccoby & Jacklin, 1974).

Method 

Subjects 

Subjects in the experiment were students in an introductory psychology
class who participated in the experiment as one option in fulfillment of a
course requirement. Subjects ranged in age from 18 to 32 years, with a mean
age of 19.42. Sixty-two female and 54 male students participated.

Problems

Subjects judged a set of 12 covariation problems, structured so that each
of four judgment rules would produce a distinctive judgment pattern on the
problem set. Table 1a lists the actual problems used. The 12 problems
include three problems for each of the four strategy types. One
noncontingent and two contingent relationships are included for each
strategy problem type.

Twelve different problem contents were developed, each of which
consisted of a set of observations picturing one of two states for two
potentially related everyday events. Three problems pictured bakery products
which either rose or fell in association with the presence or absence of
yeast, baking powder, or a "special ingredient." In three other problems,
plants were pictured as healthy or sick as a possible function of the presence
or absence of plant food, bug spray, or "special plant medicine."

Rationale

A much neglected area of research in mathematical reasoning is
that of children's understanding of statistical concepts. Statistical
problems, however, do stand as prime areas for application of mathematical
training. In particular, statistics are necessary for identifying
predictability in an environment where relationships are frequently
probabilistic (x is more likely when y is present) rather than deterministic
(x always occurs when y is present). Problems such as these are common
in identifying regularities in scientific phenomena, and in everyday
contexts as well. In this respect, statistics provide a key link
between basic mathematical concepts and central aspects of scientific
and everyday problem solving.

As an area for application of mathematical training, research on
statistical reasoning may also be informative about children's ability
to apply their mathematical skills appropriately. Central to probabilistic
reasoning is understanding of ratios and fractions. Since a probability
is a ratio between two frequencies, probability assessment requires
that a person be able to identify the relevant frequencies and calculate
the ratio between them. Thus, research in statistical reasoning
should prove profitable in understanding children's acquisition of
basic skills as well as their ability to use those skills in applied
settings.

Reasoning such as this underlies the call of several educators
for development of training programs to improve children's understanding
of statistical concepts (e.g., Harvey, 1975; Cambridge Conference on
the Correlation of Science and Math in the Schools, 1969). Research
in this area is critical for developing and testing such curricula in
probability and statistics (e.g., Shepler, 1969; Kurtz & Karplus,
1979; Ojeman, Maxey, & Snider, 1965 a & b) for children in the elementary
through high school years.

The focus of existing research in this area has been on children's
probability judgments. Early work by Piaget and Inhelder (1975)
indicated that full understanding of probability was realized by
adolescence. Subsequent work by other investigators indicates that
younger children evidence some preliminary concepts of probability
(e.g., Fischbein, 1975; Yost, Siegel, & Andrews, 1962; Goldberg,
1966), and that training is effective in improving their judgments
(Ojeman, Maxey, & Snider, 1965 a & b; Shepler, 1969; Dunlap, 1980).

A statistical judgment more common in causal reasoning builds on
probability assessments of this sort. An individual investigating the
relationship between potential cause x and effect y would compare the
likelihood of x occurring when y is present, P(x/y), with the likelihood
that x occurs without y, P(x/ȳ). The two events are independent if
these conditional probabilities are equal; nonindependence is indicated
by any difference. The comparison is made to identify contingency or
covariation between events. Scientific procedure and statistical
analyses testify to the key role of covariation analyses in professional
practice. Although not sufficient for causal inference, covariation
is a necessary condition between cause and event. Many psychologists
further assert that everyday causal judgment is similarly based on a
covariation analysis (e.g., Michotte, 1963; Inhelder & Piaget, 1958;
Kelley, 1967; Heider, 1958). That is, people search for likely
explanations of everyday events by identifying event covariates. Thus,
competence at covariation judgment may determine a person's adequacy at
identifying real world cause-effect relationships.

Unfortunately, research investigating people's competence at judging
covariations between events has resulted in a maze of contradictory
results. In the basic paradigm, subjects are presented with data instances
illustrating one of two event states (e.g., presence or absence) for
each of two events. The subject's task is to identify the direction
and/or strength of the relationship between the events. Inhelder and
Piaget (1958) and Seggie and Endersby (1972) each found accuracy to be
the norm among adolescent and adult subjects identifying such relationships.
Others (e.g., Neimark, 1975; Smedslund, 1963; Jenkins & Ward, 1965; Adi,
Karplus, Lawson, & Pulos, 1970) have found full competence to be rare
among populations comparable in age and expertise.

While the evidence indicates that covariation judgments are often
erroneous, those judgments may be rule-governed nonetheless. Specifically,
subjects may evaluate relationships according to a variety of rules,
each of which should produce a characteristic performance pattern. Four
rules are proposed as possible judgment strategies. The rules are discussed
in terms of possible relationships between two events (A and B), each of
which occurs in one of two states (1 and 2). Possible combinations of
those event states are illustrated in Table 1.

Least sophisticated of the proposed strategies is judgment according
to the frequency with which the target events cooccur (A1B1, cell a in
Table 1), failing to consider joint event nonoccurrences (A2B2, contingency
table cell d) in defining the relationship. A subject using this strategy
would identify a positive relationship between A1 and B1 if cell a
frequency was the largest of the contingency table cells, a negative
relationship if it was the smallest (cell a strategy). This strategy is
identified by Inhelder and Piaget (1958) as common among younger adolescents.
Smedslund (1963) suggests that the strategy is typical among adults as
well. The strategy does consider some relevant information and may
result in better-than-chance performance. However, the rule is a limited
one, and would be especially misleading when there is a large difference
between frequencies in contingency table cells a and d.

A much improved approach would be the strategy defined by Inhelder
and Piaget (1958) as characteristic of formal operational thinking.
Specifically, covariation would be defined by comparing frequencies of
events confirming (cells a and d) and disconfirming (cells b and c) the
relationship. Thus, the rule would compare the sums of the diagonal
cells in the contingency table (sum of diagonals strategy). Jenkins and
Ward (1965), however, suggest that this strategy has its limits as well.
Specifically, the rule is an effective index only when the two states of
at least one of the variables occur equally often. Otherwise, a correlation
may be indicated when, in fact, independence is the case.
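
The limit noted by Jenkins and Ward can be illustrated with a short sketch. It assumes a Table 1 layout in which cell a = A1B1, b = A1B2, c = A2B1, and d = A2B2; the function names and the example frequencies are hypothetical, chosen only so that neither variable's two states occur equally often.

```python
from fractions import Fraction


def sum_of_diagonals(a, b, c, d):
    """Return +1, 0, or -1 by comparing confirming (a + d) with disconfirming (b + c) cells."""
    diff = (a + d) - (b + c)
    return (diff > 0) - (diff < 0)


def conditional_probabilities(a, b, c, d):
    """Return (P(A1/B1), P(A1/B2)) under the assumed cell layout."""
    return Fraction(a, a + c), Fraction(b, b + d)


# Hypothetical frequencies: A1 occurs 18 of 24 times, B1 occurs 16 of 24 times.
a, b, c, d = 12, 6, 4, 2
p_given_b1, p_given_b2 = conditional_probabilities(a, b, c, d)

print(sum_of_diagonals(a, b, c, d))   # 1: the rule signals a positive relationship
print(p_given_b1 == p_given_b2)       # True: A1 is equally likely (3/4) under B1 and B2
```

With these frequencies the diagonal sums differ (14 versus 10) even though the two conditional probabilities are identical, which is the kind of case in which the rule would indicate a correlation where none exists.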

Instead, Jenkins and Ward (1965) suggest that covariation is more
appropriately evaluated by comparing the probability of event A1 given
event B1, P(A1/B1), with the probability of A1 given that B2 has occurred,
P(A1/B2). This is equivalent to a comparison of the frequency ratio in
Table 1 cells a/(a+c) with that in cells b/(b+d). By definition,
independence is indicated by equivalence between these conditional
probabilities; nonindependence is indicated by any difference (conditional
probability strategy). This is the most sophisticated of our proposed
strategies and should result in accurate judgment of any contingency problem.

Table 1

              B1     B2
       A1     a      b
       A2     c      d

Thus, four alternative strategies were proposed to account for
subjects' judgment patterns. According to the analysis, a subject's
error rate should depend on the particular correlation problem he or she
is judging. Problems could be identified which would be accurately
judged by all four strategies. Alternatively, error rates may be high
on problems solved only by the more general strategies.

This analysis suggests a powerful tool for identifying strategies
actually used in covariation judgment. Since different rules produce
different judgments, covariation problems might be identified which
would differentiate between those rules. In fact, careful structuring
of a problem set would allow us to identify the specific strategy a
subject is using.
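
How a single problem can separate the rules may be easier to see in a small sketch. It is an illustration under stated assumptions, not the scoring procedure used in the experiments: the cell layout (a = A1B1, b = A1B2, c = A2B1, d = A2B2), the function names, the tie handling in the cell a rule, and the reading of the a versus b rule as a direct comparison of cells a and b are all assumptions. The frequencies 2, 12, 0, 10 are those the report cites for one of its conditional probability problems. A result of +1 means A1 is judged more likely given B1, 0 means no difference, and -1 means less likely.

```python
from fractions import Fraction


def sign(x):
    """Return +1, 0, or -1 according to the sign of x."""
    return (x > 0) - (x < 0)


def cell_a(a, b, c, d):
    """Positive if cell a is the largest cell, negative if it is the smallest."""
    others = (b, c, d)
    if a > max(others):
        return 1
    if a < min(others):
        return -1
    return 0


def a_versus_b(a, b, c, d):
    """Compare the frequency of A1 under B1 (cell a) with A1 under B2 (cell b)."""
    return sign(a - b)


def sum_of_diagonals(a, b, c, d):
    """Compare confirming cells (a + d) with disconfirming cells (b + c)."""
    return sign((a + d) - (b + c))


def conditional_probability(a, b, c, d):
    """Compare P(A1/B1) with P(A1/B2); assumes B1 and B2 each occur at least once."""
    return sign(Fraction(a, a + c) - Fraction(b, b + d))


RULES = {
    "cell a": cell_a,
    "a versus b": a_versus_b,
    "sum of diagonals": sum_of_diagonals,
    "conditional probability": conditional_probability,
}

problem = (2, 12, 0, 10)  # a, b, c, d
judgments = {name: rule(*problem) for name, rule in RULES.items()}
print(judgments)
# {'cell a': 0, 'a versus b': -1, 'sum of diagonals': 0, 'conditional probability': 1}
```

Only the conditional probability rule reports that A1 is more likely given B1 (2/2 versus 12/22); the other three rules each return a different or null judgment, so a subject's answer on this one problem already narrows the set of candidate strategies.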

A set of such problems is illustrated in Table 2a. Problems are
structured hierarchically such that cell a problems are correctly solved
by all strategies; a versus b problems are correctly solved by a versus
b, sum of diagonals, and conditional probability strategies. Sum of
diagonals problems will be accurately judged by sum of diagonals and
conditional probability strategies. Conditional probability problems
would be correctly solved by the conditional probability strategy alone.
Solution accuracy is indexed by the direction of the judged relationship
(i.e., A1 more likely given B1, B2, or no difference). A subject's
solution pattern on the set of problems indicates the strategy used.
Problems on the first row of Table 2a illustrate judgments predicted by
each of the proposed rules. All problems in the row indicate relationships
in which A1 is more likely given B1 than given B2. However, an individual
using the cell a strategy would judge only the first problem as such a
relationship (cell a is the largest of the cells). A person using the a
versus b strategy would accurately judge the first two problems in the
row, but would say that A1 given B1 is as likely as A1 given B2 in the
third problem (4-4), and that A1 was less likely given B1 than B2 in
the last problem (2-12). The sum of diagonals rule would result in the
correct judgment of the first three problems, but would say that A1 was
as likely to occur with B1 as with B2 on the last problem, (2+10) = (12+0).
A subject using the conditional probability rule should accurately judge
all of the first row problems. An individual's solution pattern on the
problem set would index the strategy he or she is using. Table 2b
identifies the solution pattern congruent with each strategy type. The
probability of matching these judgment patterns by chance alone is .11
for cell a, .04 for a versus b, .01 for sum of diagonals, and .005 for the
conditional probability pattern.
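
The chance figures quoted in this paragraph can be approximately reproduced under two assumptions that the passage itself does not state: that a guessing subject picks each of the three response options with equal probability, and that a problem type counts as solved when at least two of its three problems are judged correctly. The sketch below is only a plausibility check under those assumptions; the constant and function names are hypothetical.

```python
from math import comb

P_CORRECT = 1 / 3   # assumption: three equally likely response options
N_PER_TYPE = 3      # three problems per strategy type (stated in the text)
CRITERION = 2       # assumption: at least 2 of 3 correct counts as "solved"
N_TYPES = 4         # cell a, a versus b, sum of diagonals, conditional probability


def p_type_solved(p=P_CORRECT, n=N_PER_TYPE, k_min=CRITERION):
    """Binomial probability of meeting the criterion on one problem type by guessing."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(k_min, n + 1))


def chance_of_pattern(n_solved_types):
    """Probability of solving the first n_solved_types types and failing the rest."""
    s = p_type_solved()
    return s**n_solved_types * (1 - s) ** (N_TYPES - n_solved_types)


patterns = [
    ("cell a", 1),
    ("a versus b", 2),
    ("sum of diagonals", 3),
    ("conditional probability", 4),
]
for name, n_solved in patterns:
    print(f"{name}: {chance_of_pattern(n_solved):.4f}")
```

Under these assumptions the four values come out near .105, .037, .013, and .0045, in line with the quoted .11, .04, .01, and .005. The agreement suggests, but does not establish, that a two-of-three criterion was the one used.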

[Table 2a: contingency table frequencies for the cell a, a versus b, sum of
diagonals, and conditional probability problems]

Table 2b. Strategy use and resultant patterns of problem accuracy
(+ = problem type judged accurately, 0 = not judged accurately).

                                        Problem type
   Subject                   Cell a   a versus b   Sum of      Conditional
   strategy                                        diagonals   probability

   Conditional probability     +          +           +            +
   Sum of diagonals            +          +           +            0
   a versus b                  +          +           0            0
   Cell a                      +          0           0            0

In two experiments, Shaklee and Tucker (1980) employed this diagnostic
approach to identify judgment rules of 10th grade and college subjects.
Subjects judged relationships in three problems for each proposed strategy
type. Each problem consisted of 24 instances in which event states were
defined for two events. Problems were set in contexts of everyday
events (e.g., cake rises or falls at high or low temperature; plants
healthy or not healthy which do or do not receive plant food). Subjects'
performance indicated general conformity to the strategy set. Congruence
with the cell a strategy pattern was frequent among the high school
subjects (17%) but rare in the college sample (1%). Response patterns
matched that of the a versus b strategy for 19% of the college sample
(use of this strategy was not tested among the high school subjects).
Judgment patterns were congruent with the conditional probability strategy
for 17% of the high school subjects and 29% of the college sample. In
each experiment, the modal response pattern conformed to that of the sum
of diagonals rule (3?% of the college subjects, 4?% of the high school
subjects). Thus, overwhelmingly, subjects demonstrated at least some
sophistication about appropriate covariation judgment. However, the
optimal judgment rule was used by a minority of subjects in the two
samples.

These initial investigations demonstrate the general success of
our rule diagnostic approach. Subject judgment patterns indicated
strong intraindividual consistency in rule use. Furthermore, the variety
of rules evident in these results suggests that characterization of group
judgment by any single rule would be inappropriate. As with all rule
modeling, congruence with a rule pattern requires cautious interpretation.
That is, a solution pattern conforming to one of the predicted patterns
could be the product of an alternative rule which produces judgments
isomorphic with the proposed rule. However, congruence with a given
pattern does clearly identify the other proposed models as poor
characterizations of the judgment rule. At the same time, obtained judgment
patterns severely limit the pool of viable alternative models.

This rule index offers an informative method for the study of development
in judgments of event contingencies. Particularly useful is the possibility
of identifying specific judgment rules which might be precursors of more
mature judgment competence. The steps in our strategy hierarchy may
represent a developing sequence of increasingly sophisticated rule use.
In fact, two of our proposed strategies, cell a and a versus b, are
specifically identified by Inhelder and Piaget (1958) as characteristic
of younger adolescents. The two investigators suggested that younger
subjects would fail to appreciate the relevance of joint nonoccurrences
of the target events (contingency table cell d) in defining relationships
between event states, our cell a strategy. It was also suggested that
these subjects might compare the frequency of event cooccurrences with
the frequency with which one of the events occurs without the other one
(contingency table cell b), our a versus b strategy. The sum of diagonals
strategy was believed to develop in later adolescence, at the formal
operational stage of development. Our rule diagnostic approach should
allow us to track such shifts in strategy use.

Shaklee and Mims (1981) tested college subjects and children in
4th, 7th, and 10th grade on the diagnostic problem set. Again, results
indicated a close congruence between actual and predicted judgment
patterns. A significant developmental trend demonstrated shifts toward
the use of increasingly accurate rules across the 4th to 10th grade age
span. College subjects' judgments were not significantly different from
those of 10th graders. Judgment patterns matched the a versus b strategy
for sizable groups of subjects at all ages (21% of the college subjects,
23% of 10th graders, 25% of 7th graders, 29% of 4th graders). Sum of
diagonals patterns were rare among 4th graders (17%), but common among
the older subjects (58% of college subjects, 50% of 10th graders, 50% of
7th graders). Conditional probability judgment patterns were rare until
the 10th grade (0% of 4th graders, 4% of 7th graders, 27% of 10th graders,
and 38% of college subjects). Few people at any age level evidenced cell
a judgment patterns.

Results of this developmental study suggest that development of
covariation judgment may be best conceptualized as a series of approximations
to optimal rule use. Early rules may afford better-than-chance performance
although they are restricted in utility. With increasing age, subjects
shift to more generally accurate rules. However, even among mature
subjects, optimal rule use is evidenced by a minority of subjects.

A final question of concern is the stability of judgment patterns
across judgment conditions. Our research indicates that close congruence
between subjects' judgments and those predicted by our rules is maintained
across a variety of conditions. In past work, we've varied the form in
which the frequency information was presented, using individual data
instances pictured on 5 x 8 cards (Experiment 1, Shaklee & Tucker,
1980), and sets of data instances pictured in a 2 x 2 table (second
experiment, Shaklee & Tucker, 1980; Shaklee & Mims, 1981). Most recently,
with college subjects, we've presented the frequency information in
numerical form. In all cases 85%-90% of subject records conformed to
one of the proposed rules. Additionally, we've manipulated question
form (Shaklee & Tucker, 1980, Experiment 2) in testing rule use. In one
experiment, some subjects were asked about the association between the
events (e.g., does plant health tend to be associated with getting plant
food, not getting plant food, or is there no relationship between the
two?), while other subjects were asked about the likelihood of an outcome
given the two possible states of the other variable (e.g., are plants
that got plant food more likely, less likely, or equally likely to get
well as plants that received no plant food?). Subjects' judgments indicated
that accuracy was higher in the latter response condition, but judgments
were equally likely to match the rule patterns in the two conditions.

Finally, we conducted a pair of experiments to test rule use of
college students making contingency judgments under memory load conditions
(Shaklee & Mims, 1982). Since everyday covariation judgment must rely
on recall of past frequency information, we were interested in rule use
under more comparable conditions. Frequency information was presented
in slides, each of which showed one combination of event states on the
two variables. The instances in a given covariation problem were shown
sequentially to subjects. In one condition, subjects tabulated frequencies
as the slides were shown; in a memory condition, they estimated the
frequencies after all of the instances had been shown. All subjects
were asked to judge the contingency between the events once the slide
sequence was shown. Subjects in the memory condition were significantly
poorer at frequency estimates and also used simpler, less accurate rules
than subjects in the no-memory condition. In a second experiment,
subjects asked to remember distractor information in addition to the
event frequency information showed more inaccurate estimates of event
frequency information and use of simpler judgment strategies than subjects
in a condition comparable to our prior memory condition. These two
experiments indicate that covariation judgments under memory load conditions
are substantially worse than those of subjects free of such memory
demands.

In sum, the data from several studies indicate that a carefully
structured problem set can profitably be used to indicate strategies
underlying judgments of covariations between events. Such judgments are
particularly interesting since they build so directly on the basic
mathematical understanding of ratios and fractions. That is, people
making covariation judgments should be comparing two conditional probabilities,
each of which is a ratio between two frequencies. Our evidence indicates
that substantial use of such a strategy doesn't occur until the 10th
grade, and then by only a minority of the subjects. This evidence is
congruent with other research indicating that problems in application of
ratio concepts are common among adults as well as children (Karplus &
Peterson, 1970; Kurtz & Karplus, 1979; Capon & Kuhn, 1979).

Our results have further implications for statistical reasoning as
a key link between math and science. Given the key role of covariation
assessment in causal judgment, the evidence suggests that naive causal
judgment may suffer from serious biases. That is, use of less-than-optimal
judgment rules may result in erroneous perceptions of relationships
between actually independent events, or failure to note relationships
between events which are, in fact, related. The data further indicate
that judgment problems may begin at 4th grade, when children begin to
evidence reliable strategy use. Such limited rules should be particularly
problematic as children enter the more advanced scientific training
programs of junior high and high school. Children may make progress in
rule use during those adolescent years, but biased judgment patterns
persist for the majority of people even at the college years.

While the evidence clearly identifies strategy limitations among
most subjects tested, those strategies may be subject to remediation.
In fact, one of our previous experiments (Shaklee & Tucker, 1980)
indicates that training may indeed improve performance. This experiment
incorporated two types of training: concept training and sort instructions.
Half of the subjects in this experiment began their sessions with a
discussion of event covariations, citing covariates common to everyday
life and clarifying variations in direction and strength of relationship.
Comparison subjects received no such instruction. Crossed with this
manipulation were instructions to sort the data instances (presented as
decks of 5 x 8 cards) into a 2 x 2 matrix. Comparison subjects were not
so instructed. Although sort instructions had no significant effect
(half of the subjects knew to sort the data without being instructed to
do so), concept training did significantly improve judgment accuracy.
While the evidence indicates that training may be effective in remediating
covariation judgment, the finding is somewhat general. More informative
would be an approach which develops interventions specific to strategy
levels.

Further evidence of the potential effectiveness of training such
judgments comes from related work in probability judgment. Research by
several investigators indicates that training improves probability
judgments among children from 1st grade through 6th grade (Ojeman,
Maxey, & Snider, 1965 a & b; Shepler, 1969; Dunlap, 1980). Since covariation
judgment is a comparison between probabilities, this research bolstered
our expectation that training would be effective in improving rule use
in covariation judgment as well.

Grant Supported Research

This program of research included a sequence of experiments designed
to examine the effects of training on covariation judgment. That series
began with studies to identify subjects' own understandings of their
rules and sources of individual differences in rule use. The remaining
experiments focus on questions about the trainability of those judgments.
Since all experiments employed the same rule analytic approach, the
strategy index will first be described in detail. Discussion of specific
experiments will follow.

Rule Analytic Instrument

Problems. Each subject's judgment strategy is identified through
his or her solution to 12 different covariation problems, each set in
the context of everyday events. Twelve different problem contents were
developed, each of which consists of a set of observations picturing one
of two states for two potentially related everyday events. Three problems
picture bakery products which either rose or fell in association with
the presence or absence of yeast, baking powder, or "special ingredient."
In three other problems, plants are pictured as healthy or sick as a
possible function of presence or absence of plant food, bug spray, or
"special plant medicine." In three problems people (or animals) are
pictured as sick or healthy as a possible function of presence or absence
of a shot, liquid medicine, or a pill. The remaining three problems
picture a possible association between space creatures' moods (happy/sad)
and the presence or absence of one of three weather conditions (snow,
fog, or rain).

For each problem, data instances are pictured in a 2 x 2 table.
Example frequencies used are listed in Tables 2a and 3. Tabled frequencies
indicate one noncontingent and two contingent relationships for each
strategy problem type. Direction of relationship (A1 more likely given
B1, B2, or no difference) is counterbalanced across subjects for each
problem content.

Each problem is introduced with a paragraph describing a context in
which several observations were made on two potentially related variables.
Subjects are asked to look at the pictured information and identify the
relative likelihood of one of the events when the second event was
either present or absent. An example problem:

     Spacemen from Earth landed on another planet and found
     creatures called the block-heads. They wanted to see what
     block-heads were like, so they watched them closely. Every
     Saturday they would look outside to check the weather and
     see how the block-heads were doing. Sometimes it was
     snowing and sometimes it was not. Sometimes the block-heads
     were happy and sometimes they were not. In the picture
     you will see how many times each of these things happened
     together. The picture indicates that when it was snowing
     block-heads were:

     (circle one)   a) more likely to be happy than
                    b) just as likely to be happy as
                    c) less likely to be happy than
                    when it was not snowing.

A similar paragraph and response form was developed for each problem
content. In each case, subjects indicate whether A1 given B1 was more
likely, just as likely, or less likely than A1 given B2.

their account of systematic judgment bases. Thus, verbal accounts
frequently underestimate judgment competence in research with children
(Brainerd, 1973; Bullock & Gelman, 1979; Goldberg, 1966). Research with
adults, on the other hand, indicates that subjects' explanations often
overestimate judgment sophistication. Both expert and nonexpert
judges (Goldberg, 1968; Nisbett & Wilson, 1977) describe themselves as
using complex rules that bear little resemblance to the more simple
patterns of their actual performance.

In order to investigate these relationships, college subjects were
tested with the rule analytic problem set described above. After each
covariation judgment, subjects were asked to rate their confidence in
the accuracy of their judgments. Once the problem set was completed,
subjects were asked to explain how they solved the last problem in the
set (stated strategy), to choose which of our proposed four rules was
most like theirs (model choice), and to identify which of the four rules
was the best one (best strategy). Each subject was tested and interviewed
individually.

Results showed that problem difficulty level differed as a function
of problem type, with mean accuracy decreasing as one moves up the
problem hierarchy from cell a through conditional probability problem
types (see Table 2, Appendix A for means). This pattern of problem
difficulty replicates that seen in all of our previous studies.
Subjects' confidence in their judgment accuracy also decreased as problem
difficulty increased, indicating that subjects show at least some insight
into the limits of their judgment rules. Males were more accurate than
females in covariation judgment, although there were no sex differences
in confidence ratings.

Judgment-based strategy classifications were determined as described
above. Most frequently occurring were judgment patterns congruent with
a versus b and conditional probability rules (36.2% and 31.9% of the
sample, respectively). Cell a and sum of diagonals classifications were
less common (5.2% and 15.5%, respectively). Males showed use of more
sophisticated strategies than females (see Table 3, Appendix A) in a
pattern parallel to that found for judgment accuracy.

Subjects' interview responses were compared to judgment-based rule
classifications. Correlations were significant between judgment-based
classifications and both stated strategy (r = .58) and model choice
(r = .45) measures. However, examination of subject classifications shows
that judgment-explanation agreement was substantially higher for conditional
probability subjects (97%) than for the other strategy groups (24% for other
groups combined), suggesting that some subjects knew more about what they
were doing than others. Subject choices of the best rule were found to be
reliably more sophisticated than their descriptions of their own strategy
(by model choice measure).

Overall, these results suggest that self-report may be a weak data
base for research on covariation judgment. In particular, self-report
may be a poor method for diagnosing sources of error in covariation
judgment. Our finding of strategy classification differences in self-report
accuracy is somewhat ironic from an educational point of view.
That is, the students best able to report their problem solution methods
would be those who are most accurate in judgment and, hence, need help
the least.

Experiment 2: Predictors of Rule Use Among College Students 

Our consistent evidence throughout all of our work indicates that
most subjects of a given age use a systematic rule, but that those

were confused by the tables indicating event frequencies. As a result,
we expanded our introduction of the tabled information on two practice
problems, discussing the contents of each table cell and checking comprehension
by asking the children to point to table cells with particular event-state
pairings. We also wondered if our covariation judgment question
was excessively complex syntactically. Thus, in our revised procedure,
we modified the question to read (using the blockhead problem cited
earlier):

The picture indicates that blockheads were more likely to be happy:

a) when it was snowing 

b) when it was not snowing 

c) no difference 

These modifications were made to make our problems more comprehensible
to younger children. It turns out that we outdid ourselves in this
respect. In testing a new sample of children, nearly all of our subjects
were classifiable by one of our rules in the fourth grade, and a majority
of children showed systematic rule use in the second and third grades.
Overwhelmingly, these subjects were classified as using the a versus b
rule (see Table 3, Appendix C). Thus, these results indicate that
systematic rule use is clearly within the competence of fourth grade
children, and is also common among second and third grade children. In
light of these findings, this modified procedure was deemed more
appropriate for use with subjects in the elementary school years.

Experiment 4: Eliciting Reliable Rule Use

(See Appendix C, Experiment 2, for a detailed discussion of this experiment.)

Our modified procedure indicates that reliable rule use is common
at an earlier age than our previous evidence indicated, but we still see
that judgments are frequently unsystematic among second grade children.
As a result, we were concerned about the origins of systematic rule use
in judging covariation between events. Training paradigms are commonly
used by psychologists to identify sources of developmental trends. If
one can identify a training approach which leads an individual to show
reliable rule use, the contents of that training approach may indicate key
aspects of knowledge that result in the natural acquisition of the rule.
Of course, successful training indicates only one sufficient model of
the natural developmental process. The real life transition may follow
some alternative sufficient process.

We turned our attention to identifying origins of reliable rule use
among first and second grade children. We chose not to train children
in use of the cell a rule since it so rarely occurred naturally. Instead,
we developed training programs designed to elicit use of the a versus b
rule.

Our training approach stemmed from our suspicion that the judgment
question itself focused children's attention on cells a and b of the
contingency table. Asked if plants are more likely to be healthy when
they get bug spray or when they don't get bug spray, a subject may look
at those two event conjunctions (i.e., healthy plants-bug spray; healthy
plants-no bug spray). We thought of this as a problem of attention
direction. This was the reasoning behind our Attention-only condition.

The results of this training experiment are shown in Appendix 0,
Table 2. Note that unclassifiable posttest subjects were not included
in the analyses. Trained subjects were significantly more likely to
show use of the sum of diagonals rule both at the immediate and at the
delayed posttest. This evidence indicates that subjects can indeed show
improved rule use with a relatively simple training procedure. These
training procedures were similarly effective among the younger and older
subjects in the sample. Our training in confirming and disconfirming
cases not only yielded better accuracy, but those judgments also conformed
to the pattern predicted by the sum of diagonals rule. This suggests
that this reasoning may well underlie the natural acquisition of this
rule in children's development. At a minimum, these training effects
identify one sufficient model of this developmental process.

Experiment 6: Eliciting Use of the Conditional Probability Rule 

Although all of our proposed rules may produce better-than-chance
accuracy in covariation judgment, only the conditional probability rule
will correctly judge any covariation relationship. As a result, it is a
matter of considerable educational significance to investigate the
trainability of this rule. In view of the low incidence of use of this
optimal rule at all ages tested, we should be especially motivated to
find ways to improve judgment accuracy.

Our evidence thus far indicates that the conditional probability
rule is the most difficult rule to train subjects to use. The subject
population for this study has included seventh and eighth grade children
who pretest as using the a versus b or sum of diagonals rules. Our
first training approach simply taught subjects to identify the components
of the relevant conditional probabilities. For example, on a problem
about the effects of special plant food, subjects were asked to point to
the plants that got special food and count how many were there. They
were then asked how many of these were healthy. In the same way, they
pointed to the plants that did not get special food and noted how many
were healthy. They then answered the covariation judgment question for
the problem. Subjects were corrected if they made errors in identifying
the components of the conditional probabilities, but received no feedback
as to the accuracy of their covariation judgment. This procedure was
repeated for 6 training problems. We call this condition Components
training. Our evidence shows that subjects who received this training
were no more likely to show use of the conditional probability rule than
no-training control subjects at either immediate or delayed (one week)
posttest. The training was similarly ineffective for subjects who
pretested as using either the a versus b or sum of diagonals rules.

In view of these results, we amplified our training to make it much
more explicit about how to combine the components of the conditional
probability into two ratios, and how to make comparisons between them.
Subjects then made their covariation judgments for the problem. Incorrect
responses at any point were corrected, including the covariation judgment
itself. This procedure was repeated for 6 training problems. We call
this condition the Ratio-comparison condition. Again, subjects were
junior high students who pretested as using the a versus b or sum of
diagonals rules. Results of this study show that sum of diagonals
subjects given Ratio-comparison training are no better than control
subjects in judgment at immediate or delayed (one week) posttest.

a tabled or broken time line format. These studies suggested to us that
conclusions about event sequences may vary as a function of stimulus
presentation conditions and may contribute to relative accuracy of
covariation judgment.

Most recently, we've been looking at information sampling strategies
used by subjects to test covariation and/or causal relationships. Our
past research has presented subjects with information about the relative
occurrences of events and asked them to draw a conclusion about the
depicted relationship. However, if subjects themselves wish to test a
hypothesis about an event relationship, what information would they
seek?

A common pattern found in related research is a tendency to restrict
one's sample to only a subset of the potentially relevant information
(e.g., Shaklee & Fischhoff, 1982). In our case, a subject asked whether
an outcome is associated with one of two event states might prefer to
sample information about one of those event states, thereby gathering
less information about the other event state.

Such a sampling bias would have differential impact on judgment
accuracy with each of our covariation judgment rules. For instance, the
sum of diagonals rule is an accurate estimate of a relationship only
when the two alternative states of at least one of the two events occur
equally often. The problem is easiest to demonstrate if the rule is
reconceptualized in terms of its mathematical equivalent, (a-c)-(b-d).
Thus, an individual compares the difference between the cells in the
first column of a contingency table with the difference between cells in
the second column. Those differences are only comparable estimates of
likelihood if the column totals are equal; otherwise, the same sized
difference represents a larger proportion of total instances for the
minority event than for the majority event. The problem becomes more
extreme as the difference in column totals increases, making the sum of
diagonals rule an increasingly inaccurate estimate of differential
likelihood. The a versus b rule has the same weakness in addition to
other problems. The conditional probability rule is the only strategy
that will support accurate judgments with biased sampling. However,
even a conditional probability rule will not handle the most extreme of
sampling biases. That is, if an individual only gathers data under one
event state and never samples information about the alternative state,
covariation judgment cannot be at better than a chance level of accuracy.
Such an individual will only know one of the two probabilities relevant
to the comparison. Through preferential sampling of alternative states,
people may actually generate difficult covariation problems from
relationships that would otherwise be simple to evaluate.
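
The weakness of the sum of diagonals rule under unequal column totals can
be illustrated with a small numerical sketch. The cell frequencies below
are hypothetical, chosen only so that the column totals differ; the
function names are illustrative and do not come from the report.

```python
from fractions import Fraction


def sum_of_diagonals(a, b, c, d):
    """Return (a + d) - (b + c), which is algebraically (a - c) - (b - d)."""
    return (a + d) - (b + c)


def conditional_difference(a, b, c, d):
    """Return a/(a+c) - b/(b+d), the difference in column-wise likelihoods."""
    return Fraction(a, a + c) - Fraction(b, b + d)


# Hypothetical table (not from the report): column totals are 12 and 4.
a, b, c, d = 9, 3, 3, 1

diagonal_score = sum_of_diagonals(a, b, c, d)
probability_gap = conditional_difference(a, b, c, d)

print(diagonal_score)   # nonzero: the rule would report a relationship
print(probability_gap)  # zero: the two conditional probabilities are equal
```

In this hypothetical table the outcome occurs in three quarters of the
cases in each column, so the events are independent, yet the diagonal
comparison favors a positive relationship.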

We've developed a paradigm for investigating information search
strategies used in testing hypotheses about events. Subjects were
presented with a large envelope containing the universe of observations
about two potentially related events on another planet. The large
envelope was introduced with a description of a potential relationship
between the behavior of some space creature (e.g., sleep/awake) and time
of day (daytime/nighttime). It contained two smaller envelopes, each of
which contained observations of the creature's behavior under one of the
conditions. Thus, subjects had one envelope of daytime observations and
one envelope of nighttime observations. Each observation was pictured
on a 3" x 5" card in the envelope. Subjects were to select a total of
24 observations from the two envelopes in such a way as to best test a
hypothesis about the events. Each hypothesis stated an association
between the time of day and one behavioral state (e.g., being awake is
associated with nighttime). Subjects recorded their 24 observations on a
record sheet, and concluded that the hypothesis was either true or
false. Subjects judged 3 such problems, including a positive, a negative,
and a zero relationship. These subjects also judged our strategy
diagnostic problem set. Our initial sample included third grade, seventh
grade, and college subjects. We defined a subject as biased in sampling if
the mean absolute difference in samples of the two event states across the
three problems was greater than or equal to 8. (The range of these
means could be from 0 to 24.) Sampling tendencies indicate that biased
sampling was common at all ages tested (50% of 3rd graders, 37% of 7th
graders, 32% of college subjects). When one looks closer at the nature
of the sampling bias, the overwhelming majority of cases are those in
which subjects sampled solely from one envelope (day or night), ignoring
information about the alternative event state. As noted earlier, this
is an extent of bias that even the conditional probability strategy
cannot accurately evaluate. One cannot compare two conditional
probabilities with no information about one of those probabilities.
Accuracy cannot be at better than chance levels under these circumstances.
In overview, this information sampling paradigm does show substantial
differential sampling among subjects from third grade through college age.
In view of the extremity of the bias, sampling patterns such as these
would be devastating to accuracy of covariation and causal judgments alike.
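
The bias criterion described above can be sketched in a few lines. This is
only an illustration of the stated definition (mean absolute difference
across problems of 8 or more); the sample counts and function names are
hypothetical and are not data from the study.

```python
def mean_absolute_difference(samples):
    """Average |first envelope - second envelope| over a subject's problems.

    samples: list of (n_first_envelope, n_second_envelope) pairs, one pair
    per problem; each pair sums to the 24 observations a subject selects.
    """
    differences = [abs(first - second) for first, second in samples]
    return sum(differences) / len(differences)


def is_biased(samples, criterion=8):
    """Apply the report's cutoff: biased if the mean difference is >= 8."""
    return mean_absolute_difference(samples) >= criterion


# Hypothetical subjects (illustrative counts only).
one_envelope_once = [(24, 0), (12, 12), (12, 12)]
fairly_even = [(14, 10), (12, 12), (16, 8)]

print(mean_absolute_difference(one_envelope_once), is_biased(one_envelope_once))
print(mean_absolute_difference(fairly_even), is_biased(fairly_even))
```

Under this definition, a subject who draws from a single envelope on just
one of the three problems and samples evenly on the other two already
reaches the cutoff.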

Overview of findings 

NIE support has allowed us to investigate a variety of questions
about a common form of statistical reasoning, covariation judgment. Our
past work had indicated that use of systematic but simple rules began in
the fourth grade and that subjects used more sophisticated rules with
increasing age. Our recent research supplements this research in several
important ways.

First, we found that a modified testing procedure results in spontaneous
use of systematic rules at an earlier age (i.e., 2nd-4th grade) than our
previous study would indicate. In addition, we found that a simple
training procedure would reliably elicit use of the a versus b rule in
the first and second grades. This would suggest to us that elementary
school children have important competencies in understanding probabilistic
relationships and may indicate that science or math demonstrations of
probabilistic relationships would be suitable for children in the early
primary grades.

Secondly, we find that a more advanced rule (sum of diagonals) can
be acquired in the later elementary school grades with a simple training
procedure. Contents of that procedure may suggest approaches which
should be similarly effective in improving judgments about event
covariations in classroom demonstrations in these school years. However,
we find that the conditional probability rule is not easily trained in
junior high children by the methods we tried. One interpretation of this
finding would be that training in use of this optimal rule might be
better delayed until the high school, or even college years. However,
our evidence also indicates the importance of training students in these

Judging Event Covariations

Abstract


Past research indicates poor agreement about strategies people use to
assess covariation between events. This research investigates method
of assessment as one possible source of this low consensus. A set of
problems was developed in such a way that different judgment rules would
produce different decisions about the relationships between events.
College subjects judged these problems, then were asked to explain their
judgment strategy. In addition, they were shown model strategies and
asked to choose the one like their own strategy and the model that would
be the best strategy. Subjects whose judgments indicated use of the most
sophisticated strategy were quite accurate in reporting their judgment
rules. Subjects using the less accurate rules most commonly reported
using strategies which could not have produced the obtained pattern of
problem solutions. These findings suggest that self-report is a weak
basis for conclusions about sources of error in covariation judgment.

Statistical concepts represent one prime area for application of
mathematical training. In particular, statistics are necessary for
identifying predictability in an environment where relationships are
frequently probabilistic (y is more likely when x is present) rather
than deterministic (y always occurs when x is present). Problems such
as these are common in identifying regularities in scientific phenomena,
and in everyday contexts as well. In this respect, statistics provide
a key link between basic mathematical concepts and central aspects of
scientific and everyday problem solving. As an area for application of
mathematical training, research on statistical reasoning may also be
informative about children's and adults' abilities to apply their
mathematical skills appropriately.

The focus of existing research in this area has been on
probability judgments (e.g., Piaget & Inhelder, 1975; Fischbein, 1975;
Yost, Siegel & Andrews, 1962). A statistical judgment common to reasoning
about cause-effect relationships builds on probability assessments of this
sort. An individual investigating the relationship between potential cause
x and effect y would compare the likelihood of y occurring when x is
present, P(y/x), with the likelihood that y occurs without x, P(y/x̄). The
two events are independent if these conditional probabilities are equal;
nonindependence is indicated by any difference. The comparison is made to
identify contingency or covariation between events. Scientific procedure
and statistical analyses testify to the key role of covariation analysis
in professional practice. Although not sufficient for causal inference,
covariation is a necessary condition between causes and events. Thus,
covariation analysis may identify the set of possible causes of an event.
Many psychologists further assert that everyday causal judgment is
similarly based on a covariation analysis (e.g., Michotte, 1963; Inhelder
& Piaget, 1958; Kelley, 1967; Heider, 1958). That is, people search for
likely explanations of everyday events by identifying event covariates.
Thus, competence in covariation judgment may determine a person's adequacy
in identifying real world cause-effect relationships.

In fact, a variety of investigators have found that adolescent and
adult subjects show little competence in identifying event covariations
(Niemark, 1975; Smedslund, 1963; Jenkins & Ward, 1965; Adi, Karplus, Lawson,
& Pulos, 1978). While the evidence indicates that covariation judgments
are often erroneous, those judgments may be rule-governed nonetheless.
Several different rules have been proposed by past investigators as
possible judgment strategies. These rules are discussed in terms of possible
relationships between two events (A and B), each of which occurs in one
of two states (1 and 2).

Least sophisticated of the proposed strategies is judgment according
to the frequency with which the target events co-occur (A1B1, cell a in a
traditionally labeled contingency table), failing to consider the other
event-state pairings (A1B2, A2B1, A2B2) defining the relationship. A
subject using this strategy would identify a positive relationship between
A1 and B1 if cell a frequency were the largest of the contingency table
cells, a negative relationship if it were the smallest (cell a strategy).
This strategy is identified by Inhelder and Piaget (1958) as common among
younger adolescents. Smedslund (1963) and Nisbett and Ross (1980) suggest
that the strategy is typical among adults as well. The strategy does
consider some relevant information and may result in better-than-chance
performance. However, the rule considers only a limited portion of the
information that defines the relationship and would result in erroneous
judgment of many relationships.

A second possible approach would compare the number of times target
events A1 and B1 co-occur with the times A1 occurs with B2 (comparison of
frequencies in contingency table cells a and b; strategy a versus b). This
strategy is also identified by Inhelder and Piaget (1958) as a precursor of
mature judgment. Again, this strategy considers some of the relevant
information and may result in accurate judgment of many event
contingencies. However, failure to consider frequencies in cells c and d
(event combinations A2B1 and A2B2) would be a particularly costly error
when the direction of that frequency difference is the same as the
difference between cells a and b.

A much improved approach would be the strategy defined by Inhelder
and Piaget (1958) as characteristic of formal operational thinking.
Specifically, covariation would be defined by comparing frequencies of
events confirming (cells a and d) and disconfirming (cells b and c) the
relationship. Thus, the rule would compare the sums of the diagonal cells
in a contingency table (sum of diagonals strategy). Jenkins and Ward
(1965), however, point out that this strategy has its limits as well.
Specifically, the rule is an effective index only when the two states of
at least one of the variables occur equally often. Otherwise, a correlation
may be indicated when, in fact, independence is the case.

Instead, Jenkins and Ward (1965) suggest that covariation is more
appropriately evaluated by comparing the probability of event A1 given
event B1, P(A1/B1), with the probability of A1 given that B2 occurred,
P(A1/B2). This is equivalent to a comparison of the frequency ratio in
contingency table cells a/(a+c) with that in cells b/(b+d). By definition,
independence is indicated by equivalence between these conditional
probabilities; nonindependence is indicated by any difference (conditional
probability strategy). This strategy should result in accurate judgment
of any contingency problem.

Thus, four alternative strategies have been proposed to account for
subjects' judgment patterns. Many of these rules were proposed on the
basis of subjects' explanations of their judgments. For example, Smedslund's
(1963) cell a strategy is based on the reports of over half of his sample
that they judged the relation of symptom A and diagnosis F according to the
number of AF pairings. Adi, Karplus, Lawson, and Pulos (1978) similarly
categorized subjects according to their explanations. In this case, however,
no subjects were classified as using a cell a strategy. Rather, subjects
described themselves as using various combinations of two to four of the
contingency table cells. Thus, two samples of subjects offer considerably
different explanations of their judgment strategies. Two features of these
studies make it hard to reconcile these differences. First, the two reports
offer little information on the way the explanations were elicited. We
might expect that different questions would result in different responses.
Secondly, neither of the investigators reports the level of agreement with
which subject responses were categorized, so we know little about the
reliability of the categorization schemes.

However, a more serious problem is relevant to any explanation-based
strategy analysis. That is, such an approach is predicated on the assumption
that subjects are able and willing to accurately describe their bases of
judgment. In fact, a variety of research in psychology suggests that this
assumption may not be justified. In developmental research in particular,
young children's poor verbal skills may hinder their account of systematic
judgment bases. Thus, verbal accounts frequently underestimate judgment
competence in research with children (e.g., Brainerd, 1973; Bullock, Gelman,
& Baillargeon, in press; Goldberg, 1966). Research with adults, on the
other hand, indicates that subjects' explanations often overestimate
judgment sophistication. Both expert and nonexpert judges (e.g., Goldberg,
1968; Nisbett & Wilson, 1977) describe themselves using complex rules that
bear little resemblance to the simpler patterns of their actual performance.
Ericsson and Simon (1980) note that relative accuracy of verbal reports may
depend on the conditions under which the information is gathered. These
findings would suggest that explanation-based analyses of judgment strategies
should be treated with caution.

An alternative approach would be to analyze judgment strategies on
the basis of subjects' actual performance patterns (Ward & Jenkins, 1965;
Jenkins & Ward, 1965; Shaklee & Tucker, 1980). That is, four different
rules have been proposed to account for subjects' judgments of event
covariations. Since different rules produce different judgments,
covariation problems could be identified which would differentiate between
those rules. In fact, careful structuring of a problem set should allow
us to identify the specific strategy a subject is using.

A set of such problems is illustrated in Table 1a. Problems are
structured hierarchically such that cell a problems are correctly solved
by all strategies; strategy a versus b problems are correctly solved by
a versus b, sum of diagonals, and conditional probability strategies. Sum
of diagonals problems will be accurately judged by sum of diagonals and
conditional probability strategies. Conditional probability problems
would be correctly solved by the conditional probability strategy alone.
Solution accuracy is indexed by the direction of the judged relationship
(i.e., A1 more likely given B1, A1 more likely given B2, or no difference).
A subject's solution pattern on the set of problems indicates the strategy
used. Problems on the first row of Table 1a illustrate judgments predicted
by each of the proposed rules. All problems in the row indicate
relationships in which A1 is more likely given B1 than given B2. However,
an individual using the cell a strategy would judge only the first problem
as such a relationship (cell a is the largest of the cells). A person using
the a versus b strategy would accurately judge the first two problems in
the row, but would say that A1 given B1 is as likely as A1 given B2 in the
third problem (2-2), and that A1 was less likely given B1 than B2 in the
last problem (2-12). The sum of diagonals rule would result in the correct
judgment of the first three problems, but would say that A1 was as likely
to occur with B1 as with B2 on the last problem, (2+10) - (12+0). A subject
using the conditional probability rule should accurately judge all of the
first row problems. Table 1b identifies the solution pattern congruent
with each strategy type. The probability of matching these judgment patterns
by chance alone is .11 for cell a, .04 for a versus b, .01 for sum of
diagonals, and .005 for the conditional probability pattern.
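
As a rough illustration of how the four rules diverge, the sketch below
applies each rule to the last problem in the first row. The cell
frequencies (a = 2, b = 12, c = 0, d = 10) are inferred from the arithmetic
quoted in the text, (2-12) and (2+10) - (12+0), since Table 1a is not
reproduced here. The function names are illustrative, and returning "no
difference" when cell a is neither the largest nor the smallest cell is an
assumption, because the text does not specify that case.

```python
from fractions import Fraction


def sign(value):
    """Return +1, 0, or -1 for positive, zero, or negative values."""
    return (value > 0) - (value < 0)


def cell_a(a, b, c, d):
    """+1 if cell a is the largest cell, -1 if the smallest, else 0 (assumed)."""
    others = (b, c, d)
    if a > max(others):
        return 1
    if a < min(others):
        return -1
    return 0


def a_versus_b(a, b, c, d):
    """Compare cell a with cell b only."""
    return sign(a - b)


def sum_of_diagonals(a, b, c, d):
    """Compare confirming cases (a + d) with disconfirming cases (b + c)."""
    return sign((a + d) - (b + c))


def conditional_probability(a, b, c, d):
    """Compare a/(a+c) with b/(b+d); assumes both column totals are nonzero."""
    return sign(Fraction(a, a + c) - Fraction(b, b + d))


# Cell frequencies inferred from the text for the last first-row problem.
cells = (2, 12, 0, 10)
rules = (cell_a, a_versus_b, sum_of_diagonals, conditional_probability)
judgments = {rule.__name__: rule(*cells) for rule in rules}

# +1: A1 more likely given B1; 0: no difference; -1: A1 more likely given B2.
for name, judgment in judgments.items():
    print(name, judgment)
```

On these frequencies only the conditional probability comparison (2/2
versus 12/22) yields the correct direction, which matches the pattern the
text describes for this problem.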

In two experiments, Shaklee and Tucker (1980) employed this diagnostic
approach to identify judgment rules of 10th grade and college students.
Subjects judged relationships in three problems for each proposed strategy
type. Each problem consisted of 24 instances in which event states were
defined for two events. Problems were set in contexts of everyday events
(e.g., cake rises or falls with or without "special ingredient," plants
healthy or not healthy which do or do not receive plant food). Subjects'
performance indicated general conformity to the strategy set. Congruence
with the cell a strategy pattern was frequent among the high school subjects
(VZ) but rare in the college sample (1%). Response patterns matched that
of the a versus b strategy for 18% of the college sample (use of this
strategy was not tested among the high school subjects). Judgment patterns
were congruent with the conditional probability strategy for 17% of the high
school subjects and 33% of the college sample. In each experiment, the modal
response pattern conformed to that of the sum of diagonals rule (35% of the
college subjects, 41% of the high school subjects). Subsequent studies
demonstrated that children use increasingly sophisticated rules with
increasing age in the 4th grade to college age span (Shaklee & Mims, 1981),
and that adults tend to use simpler rules as the decision environment
becomes more complex (Shaklee & Mims, 1982).

In sum, the data from several studies indicate that a carefully
structured problem set can be profitably used to indicate strategies
underlying judgments of covariations between events. Subjects in these
experiments demonstrated at least some sophistication about appropriate
covariation judgment; however, the optimal judgment rule was used by a
minority of subjects. Such judgments are particularly interesting since
they build so directly on the basic mathematical understanding of ratios
and fractions. That is, people making covariation judgments should be
comparing two conditional probabilities, each of which is a ratio between
two frequencies. Our evidence indicates that substantial use of such a
strategy does not occur until the 10th grade, and then by only a minority
of subjects. This evidence is congruent with other research indicating that
problems in application of ratio concepts are common among adults as well
as children (Karplus & Peterson, 1970; Kurtz & Karplus, 1979; Capon & Kuhn,
1979).

In addition, these findings conflict with the past interview-based
strategy analyses. In particular, Smedslund's (1963) only commonly
reported strategy, cell a, is rarely seen in the performance patterns of
our subjects. In light of this conflict, a direct comparison of explanation
and judgment-based strategy analyses would be profitable. By this approach,
subjects would be asked to complete a diagnostic problem set, then explain
their judgment bases. Comparison of classification by the two methods might
show areas of systematic disagreement. In addition, interview responses
offer new information in evaluating our judgment-based analysis. That is,
subjects may describe themselves as using rules which may differ from any
of our proposed rules, but which would produce a judgment pattern on the
problem set congruent with that of one of our rules. Finally, we learn
something about subjects' insight into their own reasoning. Such
understanding of subjects' own impressions about their task solutions would
be particularly important in any attempts to improve judgment competence.
That is, training may be maximally effective when it is oriented toward
the individual's own understanding of his or her rule use.

A second interest in this study is in subjects' evaluations of the
adequacy of the rules they use. Those using less sophisticated rules may
or may not be aware of rule limits. This study will measure judgments of
rule adequacy by asking subjects to give confidence ratings as they make
their judgments in the problem set. Subjects who are less confident of
erroneous responses than of correct responses must be aware of their rule
limitations. In addition, subjects will be asked to identify the best
rule among our set of proposed strategies.

Subjects for this experiment will be male and female college students,
since our past research suggests that this age group should provide
substantial numbers of a versus b, sum of diagonals, and conditional
probability judges. Sex of subject will be considered as a factor in the
design in light of common findings of sex differences in math skills among
adolescents and adults (e.g., Maccoby & Jacklin, 1974).

Method 

Subjects 

Subjects in the experiment were students in an introductory psychology
class who participated in the experiment as one option in fulfillment of a
course requirement. Subjects ranged in age from 18 to 32 years, with a mean
age of 19.42. Sixty-two female and 54 male students participated.

Problems

Subjects judged a set of 12 covariation problems, structured so that each
of four judgment rules would produce a distinctive judgment pattern on a
problem set. Table 1a lists the actual problems used. The 12 problems
include three problems for each of the four strategy types. One
noncontingent and two contingent relationships are included for each
strategy problem type.

Twelve different problem contents were developed, each of which
consisted of a set of observations picturing one of two states for two
potentially related everyday events. Three problems pictured bakery products
which either rose or fell in association with the presence or absence of
yeast, baking powder, or a "special ingredient." In three other problems,
plants were pictured as healthy or sick as a possible function of the
presence or absence of plant food, bug spray, or a "special medicine." In
three problems people or animals were pictured as sick or healthy as a
possible function of the presence or absence of a shot, liquid medicine, or
a pill. The remaining three problems pictured a possible association
between space creatures appearing happy or sad in the presence or absence
of one of three weather conditions (snow, fog, or rain).

For each problem, data instances are pictured in a 2 x 2 table. In
each case, the manipulated factor (or environmental event) defined the table
columns (e.g., plant food, no plant food in the example below), and the outcomes
defined the table rows (plants healthy, not healthy in the example below).
Each problem is introduced with a paragraph describing a context in which
several observations were made on two potentially related variables.
Subjects were asked to look at the pictured information and to identify
the relative likelihood of one of the events when the second event was
either present or absent. An example problem follows:

    A plant grower had a bunch of sick plants. He gave
    some of them special plant food, but some plants didn't
    get special food. Some of the plants got better but some
    of them didn't. In the picture you will see how many times
    these things happened together. The picture indicates that
    plants which were given special food were:

      +3        +2        +1         0        -1        -2        -3

     much    somewhat    a bit     just     a bit    somewhat    much
     more      more      more       as      less       less      less
    likely    likely    likely    likely   likely     likely    likely

    to get better than plants that weren't given special food.
    On your answer sheet write the scale number that best
    completes the sentence.



In addition, after each covariation judgment subjects were asked to rate
their confidence as follows:

    How certain are you in the accuracy of the above response?

    1     2     3     4     5     6     7     8     9     10
    just guessing                          absolutely certain

The 12 problems were grouped into three problem blocks, each including one
problem from each strategy type. Problems within each block were arranged
in a single random sequence. The three problem blocks were sequenced in
a single random order. Numbers in parentheses to the left of the problems
in Table 1a indicate the position of each problem in the problem sequence.

Once the problem set was completed each subject was interviewed and
asked the following questions about his or her judgment:

1a. You've just completed several problems about the relationship
    between events. Can you tell how you solved them?
1b. (Experimenter turns to the last problem in the set, a conditional
    probability problem.) Can you use this problem to show me how
    you solved it? (strategy explanation)
2.  (The participant is shown models of the strategy types while
    they are described.) Can you indicate, from the models presented,
    the strategy you used to solve the problems? (model choice)
3.  Overall, which do you feel is the "best" strategy? (best strategy)

Each subject was tested and interviewed individually.

Instructions

Initial instructions introduced the subject to the concept of covariation
in the context of "things that go together." Naturally occurring examples
were given of positive relationships (i.e., tall people are more likely to
be heavy than short people), negative relationships (i.e., it is less likely
to rain when it is sunny than when it is cloudy), and unrelated events (i.e.,
a green truck is just as likely to run out of gas as a red truck). Subjects
were told that they would be given some problems about hypothetical events
that may or may not tend to occur together. A sample problem involving the
occurrence of snow as it did or did not relate to atmospheric temperature
was used to explain the stimulus materials and the problem format. Each
subject gave a solution to the sample problem and was invited to ask
questions about the task. Subjects were allowed to progress through the
problems at their own pace and were encouraged to use the scratch paper
provided if they desired.

Results

Results can be grouped according to their relevance to two issues.
First, subjects' performances can be characterized in terms of the accuracy
of problem solutions. Confidence ratings on these problems indicate subjects'
beliefs about their accuracy. Secondly, judgment strategies are identified
according to subjects' solution patterns on the problem set and their
responses to the interview questions.

Accuracy. Accuracy was assessed in terms of the direction of the
judged relationship (i.e., A1/B1 more, less, or equally likely than A1/B2).
Data are analyzed in terms of the number of problems correct per problem
type. Relevant means for this analysis are reported in Table 2. A sex by
problem type analysis of variance shows a main effect of problem type
(F(3,342) = 164.36, p < .001), with mean accuracy of 2.88 for cell a, 2.65
for a versus b, 1.47 for sum of diagonals, and 1.21 for conditional
probability problems. A main effect of sex of subject was also significant
(F(1,114) = 6.67, p < .01), with more problems correctly solved by males
than by females. The sex by problem type interaction was also significant
(F = [illegible].08, p = .03), with the greatest sex differences in accuracy
for the sum of diagonals and conditional probability problems (see Table 2).

A sex by problem type analysis of variance of confidence ratings showed
that subjects had some insight into solution accuracy. This was reflected
in a significant effect of problem type on confidence ratings, with
confidence decreasing as problem difficulty increased (F(3,342) = 25.60,
p < .001). Mean confidence ratings were 8.5 for cell a, 8.4 for a versus b,
7.8 for sum of diagonals, and 7.7 for conditional probability problems.
Confidence judgments did not differ by sex either as a main effect or in
interaction with problem type.

Strategy. Each subject's pattern of solution accuracy on problems of
the four types was used to identify his or her judgment strategy. Performance
patterns congruent with the four strategies are illustrated in Table 1b.
A subject was said to have passed criterion on a given problem type if he
or she was accurate on two or more of the three problems of that type. A
conditional probability subject should pass criterion on all problem types.
Sum of diagonals judges should pass criterion on all problem types except
the conditional probability problems. Judges using the a versus b rule
should pass criterion on cell a and a versus b problems. Cell a subjects
should pass cell a problems alone. Someone who passes no criteria would
be labeled Strategy 0. Judgment patterns that do not match any of these
predicted patterns are classified as "other." Classification by this method
will be referred to as the judgment-based strategy.
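
The classification rule just described can be written out as a short sketch. The function and variable names below are illustrative and do not come from the original study materials; the pass criterion (two or more of three problems correct) and the expected patterns follow the description above.

```python
PROBLEM_TYPES = (
    "cell a",
    "a versus b",
    "sum of diagonals",
    "conditional probability",
)

# Expected pass/fail pattern across the four problem types for each strategy.
# Each strategy passes its own problem type and every simpler type.
PATTERNS = {
    (False, False, False, False): "strategy 0",
    (True, False, False, False): "cell a",
    (True, True, False, False): "a versus b",
    (True, True, True, False): "sum of diagonals",
    (True, True, True, True): "conditional probability",
}


def classify(num_correct):
    """Classify a subject from the number of problems correct (0-3) per type."""
    passed = tuple(num_correct[ptype] >= 2 for ptype in PROBLEM_TYPES)
    # Any pass/fail pattern outside the five predicted ones is "other".
    return PATTERNS.get(passed, "other")


# Hypothetical subject: accurate on the two simplest problem types only.
example = {
    "cell a": 3,
    "a versus b": 2,
    "sum of diagonals": 1,
    "conditional probability": 0,
}
label = classify(example)
print(label)
```

A subject who passed only the conditional probability problems, for instance, would match none of the predicted patterns and would be classified as "other."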

Distribution of these judgment-based classifications is illustrated for
each of the two sexes in Table 3. These results indicate that all subjects
passed at least one criterion, indicating that they understood the stimuli
and had at least a simple understanding of the judgment to be made. Most
frequently occurring were judgment patterns congruent with a versus b and
conditional probability rules (36.2% and 31.9% of the sample, respectively).
Cell a and sum of diagonals classifications were less common (5.2% and
15.5%, respectively). Judgments of 13 subjects failed to match any of our
proposed patterns and were classified as "other." Table 3 also shows males
as generally using more sophisticated strategies than those used by females.
The distributions of the two sexes were compared by assigning each subject
a number corresponding to the number of problem type criteria passed (cell
a = 1, conditional probability = 4). A t test comparing males and females
on strategy classification shows the sex difference in strategy use to be
reliable (t(101) = 2.60, p < .01).

A final judgment-based strategy analysis compares the confidence ratings
of subjects in each of the strategy classifications. A subject strategy
by problem type analysis of variance showed no significant difference as
a function of subject judgment strategy (F(3,99) = 1.54, ns). However,
subject strategy did interact with problem type (F(9,297) = 2.68, p < .01).
In this interaction, subjects classified as a versus b, sum of diagonals,
and conditional probability judges showed parallel decreasing confidence
as problem difficulty increased. However, cell a judges were least confident
on a versus b problems. As in the previous analysis, confidence ratings
also showed a main effect of problem type (F(3,297) = 28.68, p < .001).

Independent categorizations of subjects' strategies were based on
their responses to the interview questions. First, subjects were asked
to state their strategies (question 1a) and to demonstrate that strategy
on a sample problem (question 1b). These two responses were considered
together and coded according to whether they conformed to one of our four
strategies. Two alternative responses were also common. Several subjects
described themselves as using a variant of the conditional probability
strategy which compared ratios of cell frequencies [illegible] with cell
frequencies [illegible]. This strategy would produce the same judgments as our
conditional probability strategy and will be labelled cell ratios. A second
common response was for a subject to say that he or she had just guessed.
Responses that did not match any of these categories were labelled "other."
All responses were independently categorized by two coders. These two raters
agreed on 89% of their ratings. Table 4 illustrates these classifications
of subjects' explanations.

Once subjects had stated their strategies, they were shown a model
of each of the four proposed strategies and asked to identify the one which
most closely resembled their problem solving approach. This classification
is referred to as model choice. Frequency of choices of the various models
is shown in Table 5. Responses not represented in the strategy examples
were coded as "other." Of these unclassifiable subjects, six said that they
used more than one rule, and the remaining subjects said that they used some
strategy not listed in the models.

Finally, subjects were asked to indicate the best strategy among the
four examples. This response will be labelled best strategy. Table 6
lists frequencies of subjects' choices of each of the strategies. The
group categorized as "other" includes several subjects who thought that
two or more categories were equally good, some subjects who thought the
cell ratio strategy was best, and some subjects who preferred some strategy
not listed in the examples.

As in the judgment-based strategy classification, a subject's strategy
classification on each of these three measures was converted to a scale
score corresponding to the level of his or her classification in the
strategy hierarchy. Since cell ratio judges should produce the same
judgments as conditional probability rule users, these two rules were grouped
together in these analyses. Subjects who said that they guessed were
given a score of 0. Comparisons between classification methods were made
in terms of these scale scores. The unclassifiable subjects were not
included in these analyses.

Correlations between the various strategy classifications indicate
some congruence between methods. The correlation between judgment-based
strategy classification and stated strategy is .58 (p < .001). Classification
of subjects by the two methods is illustrated in Table 4. Comparisons
between these classification systems indicate that differences between
classifications by the two methods do not show a reliable direction
(t(94) < 1, ns). A close inspection of Table 4 shows that
performance-explanation congruence differed according to subjects' strategy
classification. Subjects whose performance patterns showed use of a
conditional probability rule were almost uniformly accurate in describing
their strategies (97% of conditional probability subjects). Among the other
groups combined (excluding "other") only 24% of the subjects described rules
congruent with their performance patterns. A comparison of the two groups
shows this difference to be reliable (chi-square = 45.46, df = 1, p < .001).

Comparison between judgment-based classification and subject's model
choice also showed reliable congruence between the two methods (r = .45,
p < .001). Table 5 shows classification of subjects by the two methods.
Comparison between the classification methods shows that model choices were
neither reliably more nor less sophisticated than their judgment-based strategy
classification (t(98) < 1, ns). The correlation between the strategy explanation
and model choice measure indicates some agreement between these two
self-report measures (r = .53, p < .001), with the subject classifications
neither better nor worse by the two methods (t(99) < 1, ns).

Finally, subjects' selection of best strategy was compared to their
classifications by other methods. Model choice and best strategy used
the same multiple choice method, and were thus deemed to make the best
case for comparison (see Table 6 for classification by the two methods).
Subjects' selections of best strategy were reliably more sophisticated
than the strategy they identified as their own (t(88) = 5.35, p < .001),
suggesting that subjects recognized a better way to solve the problems
when one was provided. Their choices of best strategy were also more
sophisticated than their judgment-based strategy classifications (t(84) = 7.19,
p < .001).

Discussion

These results offer considerable evidence on relative congruence
among self-report and performance-based methods of identifying strategies
underlying covariation judgments. All comparisons suggest some agreement
between methods, with correlations ranging from .45 to .58. Correlations
at this level indicate that subjects have some insight into their judgment
bases. However, closer inspection of Tables 4 and 5 indicates that some
subjects show considerably better insight than others. In particular,
conditional probability subjects (judgment-based classification) are
impressively accurate, with 97% describing a conditional probability (or
cell ratio) strategy in their strategy explanation, and 84% selecting that
strategy in the model choice measure. In sharp contrast, all other subject
groups show poor congruence between the performance-based and self-report
measures, with 24% agreement between judgment and explanation measures and
25% agreement between judgment and model choice.
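
The two model choice agreement figures can be approximated from the Table 5 frequencies, as in the sketch below. Two things here are assumptions rather than facts from the source. First, the a versus b row is partly illegible in the scan, so its entries are inferred from the row and column totals. Second, the choice of denominators (all conditional probability subjects for the first figure, and subjects with a classifiable model choice for the second) is a guess made because it reproduces the reported percentages.

```python
COLUMNS = (
    "guess",
    "cell a",
    "a versus b",
    "sum of diagonals",
    "conditional probability",
    "other",
)

# Rows: judgment-based classification. Columns: model choice (see COLUMNS).
# The "a versus b" row is inferred from marginal totals (an assumption).
TABLE_5 = {
    "cell a": (0, 1, 3, 1, 1, 0),
    "a versus b": (6, 4, 14, 7, 10, 1),
    "sum of diagonals": (2, 2, 2, 1, 9, 2),
    "conditional probability": (2, 0, 0, 3, 31, 1),
}


def agreement(rows, exclude_other):
    """Return (matching subjects, subjects counted) for the given rows."""
    other = COLUMNS.index("other")
    matches = sum(TABLE_5[row][COLUMNS.index(row)] for row in rows)
    counted = sum(sum(TABLE_5[row]) for row in rows)
    if exclude_other:
        counted -= sum(TABLE_5[row][other] for row in rows)
    return matches, counted


cp_matches, cp_counted = agreement(["conditional probability"], exclude_other=False)
rest_matches, rest_counted = agreement(
    ["cell a", "a versus b", "sum of diagonals"], exclude_other=True
)

cp_percent = round(100 * cp_matches / cp_counted)
rest_percent = round(100 * rest_matches / rest_counted)
print(cp_percent, rest_percent)
```

Under these assumptions, 31 of 37 conditional probability subjects chose the matching model, against 16 of 63 subjects in the other three groups.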

The strength of our judgment-based classification system is our ability
to evaluate whether a stated rule would produce the obtained judgment pattern.
A close inspection of Table 4 illustrates this comparison. For example,
no subject with a cell a judgment pattern described him or herself as
using a cell a judgment rule. Our interpretation of this difference would
be ambiguous if these subjects described rules which would produce a cell
a judgment pattern on the problem set. However, this was not the case.
Half of these subjects said they were guessing, an approach which would
yield a cell a pattern only 11 percent of the time (i.e., the chance
probability of producing the pattern). The remaining subjects with cell
a performance patterns said they were using cell ratios, a strategy which
would result in a conditional probability judgment pattern. Subjects
showing a versus b patterns also showed poor insight into rule use, with
13 of 42 classifiable subjects describing themselves as using rules which
should produce more errors than they actually showed, and 11 subjects
describing strategies which should have produced more accurate records
than actually obtained. Most of the subjects whose judgment performance
indicated sum of diagonals strategy use described strategies that would
produce conditional probability judgment patterns. Several subjects
described themselves as comparing cells [illegible] with [illegible], a
strategy which would mimic a conditional probability strategy on the problem
set. However, it is interesting to note that only one of the subjects who
said they were using cell ratios produced a judgment pattern congruent with
their described rule. As noted earlier, self-report and judgment pattern
were congruent for conditional probability judges. In these cases we are not
simply noting relative agreement between performance and explanation. Our
rule diagnostic problem set also allows us to show whether subjects'
self-reported rules would have produced their actual performance patterns.

One possible interpretation of poor agreement between judgment and
explanation might be that subjects shifted rule use at some point in the
problem set. A subject may have judged the initial problems by one
strategy, but changed strategy by the end of the problem set. This
individual's judgments might yield a classification according to the
initial strategy, but he or she would be accurate in describing use of a
different strategy to solve the last problem. In fact, some of our
subjects said that they used more than one rule in response to the model
choice question. This possibility may explain a few judgment-explanation
discrepancies, but our rule classification system makes it unlikely as a
general account. That is, a subject had to accurately judge at least two
of the three problems of each strategy type to have passed criterion on
that type. The problems were blocked such that one problem of each strategy
type appeared in each third of the problem sequence. A subject would have
to shift strategy after the eighth problem of the set to have met the
criteria for his or her initial problem solution strategy in the
judgment-based classification. Shifts at other points should produce judgment
records that do not conform to any of our strategy patterns. These subjects
would be labeled "other" and not be included in our method comparisons. In
fact, such unclassifiable subjects were infrequent in this sample (11.2%).
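
The chance probability of a guessing subject producing a cell a pattern (the 11 percent figure cited in the discussion of Table 4, not the 11.2% of unclassifiable subjects) can be checked with a short calculation. The sketch assumes that a guesser picks each of the three directions (more, equally, or less likely) with probability 1/3 on every problem, independently across problems, and that exactly one direction is scored correct per problem. These assumptions are not stated in the text; they are used here because they reproduce the reported value.

```python
from fractions import Fraction
from math import comb

# Assumed probability of a correct direction on any one problem when guessing.
P_CORRECT = Fraction(1, 3)


def p_pass(p, n_problems=3, criterion=2):
    """Probability of at least `criterion` correct out of `n_problems`."""
    return sum(
        comb(n_problems, k) * p**k * (1 - p) ** (n_problems - k)
        for k in range(criterion, n_problems + 1)
    )


pass_prob = p_pass(P_CORRECT)

# A cell a pattern means passing criterion on the cell a problems
# and failing criterion on each of the other three problem types.
cell_a_by_chance = pass_prob * (1 - pass_prob) ** 3

print(pass_prob, cell_a_by_chance, round(float(cell_a_by_chance), 2))
```

Under these assumptions a guesser passes criterion on any one problem type with probability 7/27, and the joint probability of the cell a pattern rounds to .11.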

These results show that agreement between different self-report measures
is limited as well. The correlation between subjects' strategy explanation
and model choice was a modest (though significant) .53. Thus, the issue is
not simply one of the validity of self-report of strategy use. Method of
obtaining that self-report affects subjects' responses as well.

These comparisons suggest that self-report may be a weak data base for
research on covariation judgment. We note, however, that there may be conditions
under which self-reports would be more accurate. Our subjects described
their strategies after solving a series of problems. Ericsson and Simon
(1980) argue that features of memory and attention might predict that
reports would be erroneous under these conditions. In particular, subjects
must retrieve the relevant information from long term memory in order to
explain their judgment rule. Potential sources of error include problems
in storing or retrieving the information from long term memory and incomplete
reporting of the available information. Ericsson and Simon (1980) argue
that such problems are minimized by gathering self-reports through a
think aloud technique in which subjects verbalize their reasoning as
they solve the problem.

Although alternative techniques may improve self-report accuracy,
our method is most relevant for comparison with past research in this area.
In particular, Smedslund (1963) and Adi and colleagues (1973) each asked
subjects to explain their strategies after making several judgments about
event covariations. Our evidence suggests that self-report of
less-than-optimal strategies will be inaccurate under these circumstances.

Considering covariation judgment as a problem in applied mathematics,
our findings also have implications for educational assessment. That is,
self-report may be a poor method for diagnosing the sources of individual
students' errors in applying ratio concepts. Our finding of strategy
classification differences in self-report accuracy is somewhat ironic from
an educational point of view. That is, the students best able to report
their strategies would be those who need help the least. The success of a
program to improve these judgments may well depend on the starting strategy
of the individual involved. Our evidence indicates that student self-report
is unlikely to yield an accurate diagnosis of sources of judgment error.

Our subjects do show some insight into the strengths and weaknesses
of their chosen strategies. First, confidence ratings showed that subjects
were less confident of their accuracy on problems where errors were high
than on problems where error rates were low. Secondly, twice as many
subjects selected the conditional probability rule as the best rule as
were classified as using the rule in problem solutions (32 percent vs.
65 percent). One might wonder why subjects would persist in using a rule
they knew was flawed. However, shifting rules requires that subjects be
able to generate a better rule to use. This evidence indicates that subjects
are better at recognizing good rules than at producing those rules on their
own.

A final consistent finding worth noting is the sex difference in
judgment accuracy and strategy use. This sex difference is not surprising
in the light of much past research showing males better than females in
mathematical reasoning beginning in junior high and continuing throughout
adulthood (Maccoby & Jacklin, 1974). Since the conditional probability
rule builds so directly on comparisons of two ratios, we might expect sex
differences in this judgment as well. Our method offers the additional
advantage of identifying specific strategies employed by subjects of each
sex. Compared to males, females were especially unlikely to use the
conditional probability rule (19.3 percent vs. 46.3 percent), preferring
the simpler and less accurate a versus b rule (41.9 percent vs. 29.6 percent).
This difference could have several possible sources. One likely source
is simply that the two sexes came to the experiment with different training
backgrounds. Other studies have found males and females to be substantially
different in participation in math courses by the time they get to college
(Fennema, 1977; Keeves, 1973; Hall & Shaklee, Note 1; National Assessment of
Educational Progress, 1979). Further work would be required to assess the
role of differential math training in sex differences in covariation
judgments.



In overview, our results indicate that subjects' self-reports of
covariation judgment rules show limited congruence with actual judgment
patterns. Self-report was an especially poor method for identifying
sources of inaccuracy in judgment patterns. Such effects of assessment
method offer a ready explanation for poor agreement about strategy use
in past studies of covariation judgment. These results suggest that
self-report measures are weak bases for drawing conclusions about strategy use.
These problems with self-report in covariation judgment accord well with
other research showing poor correspondence between subjects' judgments
and their explanations about those judgments.

Footnotes

Partial support for this research was provided by NIE grant NIE-G-80-0091.
Many thanks to Renée Smith for her help in collecting this data.
Reprint requests should be sent to Harriet Shaklee, Department of Psychology,
University of Iowa, Iowa City, Iowa 52242.

We had some difficulty defining a noncontingent relationship for the
sum-of-diagonals problems. The problem we included (middle problem, column
3, Table 1a) deviates slightly from independence (P(A1|B1) - P(A1|B2) = -.06)
by the conditional-probability rule. As a result we scored responses as
correct if subjects concluded that A1/B1 was either less likely or just as
likely as A1/B2. The problem does discriminate appropriately between the
other judgment rules. Cell-a and a-versus-b judges should say that A1/B1
is more likely than A1/B2; sum-of-diagonals judges should say the two
outcomes are equally likely.



Table 1

A) Cell frequencies used for each problem type

[Table body not recoverable from the source scan.]

B) Strategy use and resultant patterns of problem accuracy
   (+ = accurate, 0 = inaccurate)

                                      Problem Strategy Type

                                                    Sum of      Conditional
Strategy                   Cell a    a versus b    Diagonals    Probability

Conditional probability      +           +             +             +
Sum of diagonals             +           +             +             0
a versus b                   +           +             0             0
Cell a                       +           0             0             0
Strategy 0                   0           0             0             0


Table 2

Mean Judgment Accuracy Per Problem Type

                                    sum of      conditional    all
           cell a    a versus b    diagonals    probability    types

females     2.81        2.64         1.23          1.00        1.90
males       2.96        2.65         1.72          1.43        2.20
all         2.88        2.65         1.47          1.21        2.05


Table 3

Judgment-Based Strategy Classifications
(percentages)

                                    sum of      conditional
           cell a    a versus b    diagonals    probability    other      N

males        3.7        29.6         11.1          46.3          9.3      54
females      6.4        41.9         19.3          19.3         12.9      62
all          5.2        36.2         15.5          31.9         11.2     116


Table 4



Frequencies of Strategy Classifications by Judgment-Based
and Strategy Explanation Methods

                                      Strategy Explanation

Judgment                                sum of     conditional   cell
Based         guess  cell a  a vs b   diagonals   probability   ratios  other   all

cell a          3      0       0          0            0           3       0      6
a versus b      9      2      13          2            1           8       7     42
sum of
diagonals       .      .       .          .            .           .       .     18
conditional
probability     .      .       .          .           35           .       .     37
other           .      .       .          .            0           .       .     13
all             .      .       .          .            .           .       .    116

[A period marks an entry that is not legible in the source scan. Four column
totals are legible but cannot be assigned to columns: 17, 15, 42, 28.]


Table 5

Frequencies of Strategy Classifications by Judgment-Based
and Model Choice Methods

                                         Model Choice

Judgment                                     sum of     conditional
Based            guess  cell a  a versus b  diagonals   probability  other   all

cell a             0      1         3           1            1         0       6
a versus b         6      4        14           7           10         1      42
sum of
diagonals          2      2         2           1            9         2      18
conditional
probability        2      0         0           3           31         1      37
other              1      0         1           1            6         4      13
all               11      7        20          13           57         8     116


Table 6

Frequencies of Strategy Classifications by Model Choice
and Best Strategy Methods

[Cell entries are not recoverable from the source scan. Rows are best
strategy (cell a, a versus b, sum of diagonals, conditional probability,
other, all); columns are model choice (guess, cell a, a versus b, sum of
diagonals, conditional probability, other, all). Legible best strategy row
totals: cell a = 2, a versus b = 7, conditional probability = 76, other = 22,
all = 116.]


Questionnaire and table: Predictors of
covariation judgment strategy use

JUDGING RELATIONSHIPS BETWEEN EVENTS SURVEY

Please answer each of the following questions to the best of your
ability. Write your responses on the accompanying answer sheet.

1.  How much math did you take during grades 9 and 10?

2.  Please estimate the quality of your performance in these
    courses:

        1         2         3         4         5
       poor             average             excellent

    (Write the number corresponding to your estimate on your
    answer sheet for each scale of this type.)

3.  How much math did you take during grades 11-12?

4.  Please estimate the quality of your performance in these
    courses:

        1         2         3         4         5
       poor             average             excellent

5.  Did you ever seek guidance from a high school counselor or
    counselors regarding election of math courses?

6.  Please indicate the general attitude of any counselors consulted
    regarding your election of math courses:

        1         2         3         4         5
       very             neutral              very
    unfavorable                            favorable

7.  Please indicate the amount of influence the counselor's advice
    had on your election of math courses:

        1         2         3         4         5
       none         some influence      strong influence

8.  How many college mathematics and math-related semesters have
    you completed thus far?

9.  Please estimate the quality of your performance in these
    courses:

        1         2         3         4         5
       poor             average             excellent


10. Have you sought guidance from your college advisor regarding
    your election of math courses?

11. Please indicate the attitude of your advisor toward your electing
    math courses:

        1         2         3         4         5
       very             neutral              very
    unfavorable                            favorable

12. Please indicate the amount of influence your advisor's
    recommendations had on your election of math courses:

        1         2         3         4         5
       none         some influence      strong influence

13. How many math and math-related courses do you expect to take in
    the future?

14. Please indicate the amount of interest mathematics holds for you:

        1         2         3         4         5
      boring            neutral           interesting

15. Please estimate the usefulness of mathematical knowledge to
    your future career:

        1         2         3         4         5
    not at all           maybe             extremely
      useful             useful              useful

16. How many semesters of logic have you taken?

17. How many semesters of statistics or probability have you taken?

18. What is your major course of study?

19. How favorable is your mother's attitude toward your pursuing
    a college education?

        1         2         3         4         5
       very             neutral              very
    unfavorable                            favorable

20. How favorable is your father's attitude toward your pursuing
    a college education?

        1         2         3         4         5
       very             neutral              very
    unfavorable                            favorable

Thank you very much for your cooperative participation.


Table 1



Correlation coefficients and
number of observations (in parentheses)
for Questionnaire data

          MATH       ACC        STRAT      ACT-Q      ACT-C      INTRST     USE        ABIL

ACC       .16
          (186)

STRAT     .04        .91***
          (161)      (161)

ACT-Q     [--]       .18        .15
          (186)      (186)      (161)

ACT-C     .28***     .17        .16        .81***
          (186)      (186)      (161)      (186)

INTRST    [--]       .15        .12        .35***     .24**
          (186)      (186)      (161)      (186)      (186)

USE       .43***     .22        .20        .23**      .22*       .53***
          (186)      (186)      (161)      (186)      (186)      (186)

ABIL      .32***     .09        .03        .39***     .33**      .58***     .34**
          (104)      (104)      (90)       (104)      (104)      (104)      (186)

ATT       .28        .12        .04        .27        .15        .32*       .21        .24
          (54)       (54)       (51)       (54)       (54)       (54)       (54)       (37)

[--] = coefficient not legible in the source scan

  * p < [illegible]
 ** p < .001
*** p < .0001

MATH:   Math background
ACC:    Accuracy on covariation judgment problems
STRAT:  Covariation judgment strategy
ACT-Q:  Quantitative score on ACT exam
ACT-C:  Combined score on ACT exam
INTRST: Interest in mathematics
USE:    Usefulness of mathematics
ABIL:   Self-rated math ability
ATT:    Counselor's attitude toward math


Abstract

Related research suggests that children may show some simple undf>j,^tanding of 
event covariations by the early elementary school years. The present experiments 
use a rule analysis methodology to investigate covariation judgments of children 
in this aga range. In Expe- iment 1, children in second^ third and fourth 
grade judged covariations on 12 different covariation problems. Children's 
perforrcance patterns on the problem set showed an increase in the use of 
systematic judgment strategies in this age range. Systematic rule users most 
commonly compared contingency table cells a and b in judging the event covariations. 
In Experiment 2, a training paradigm was employed to investigate possible 
origins of systematic rule use. First and second grade unsystematic, strategy 
0 and cell-a children were either directed to attend to ceills a and b (Attention 
only), were addicionally offered explicit instructions to note wnich of the^ 
two cells had more events (Attention-plus-more) or were givt^n no training 
(control). Posttest performance showed that the Attention-plus-more condition 
was Che only treatment to reliably elicit a-versus-b rule use. It is concluded 
chat simple covariation judgment rules can bf^. used by children in the early 
elementary, school year^. 
Covariation Judgment: Systematic Rule Use in the Early Years

Interest in children's causal reasoning has burgeoned in recent years
(e.g., Siegler, 1976; Bullock, Gelman, & Baillargeon, 1982). A number of
theorists have suggested that identification of cause-effect relationships is
grounded in covariation judgment (e.g., Inhelder & Piaget, 1958; Kelley, 1972).
That is, people search for causes of events by finding event covariates. In
fact, a few investigations indicate that children understand this link from an
early age. For example, DiVitto and McArthur (1978) found that children as
young as first grade use summarized covariation information in explaining people's
behavior. Siegler and Liebert (1975), however, found that children were not
influenced by event covariation until 8 or 9 years of age in their study of
children's explanations of physical events. Evidence of the earliest use of
event covariation in causal reasoning is provided by Shultz and Mendelson (1975),
who found that 3 and 4 year old children showed a preference for covariates
when choosing causes of events. Although the age trends differ in these studies,
they concur in suggesting that preference for consistent covariates is an early
developing pattern in children's explanations of events.

Given this evidence, understanding development in covariation judgment
would be critical to understanding children's causal reasoning. However,
investigations of children's abilities to make covariation judgments are rare
indeed. Those few studies which do exist show a degree of consensus on how
children might judge event relationships (Inhelder & Piaget, 1958; Adi, Karplus,
Lawson, & Pulos, 1978; Shaklee & Mims, 1981). In the basic paradigm, investigators
offered subjects information on the frequency of cooccurrence of alternative
event states of two potentially related variables (for example, plants healthy
or not healthy; plant food present or absent). Subjects were asked to identify
the direction and/or strength of the relationship between the events. In each
experiment, subjects' covariation judgments and/or explanations of the
judgments led the investigators to identify systematic but inaccurate rules
which were precursors to the use of more mathematically sophisticated rules.

Inhelder and Piaget (1958) proposed two simple rules of covariation
judgment. In the first, an individual would judge a relationship according to
the frequency with which target event states cooccur (e.g., healthy plants
which are given plant food in the example above; cell a of a traditionally
labeled contingency table, see Table 1). A subject using this strategy would



Insert Table 1 here



identify a positive relationship between events if the cell a frequency were
the largest of the contingency table cells, and a negative relationship if it
were the smallest (cell-a strategy). Inhelder and Piaget (1958) identified
this strategy as common among younger adolescents. Smedslund (1963) and
Nisbett and Ross (1980) thought the strategy might typify adult reasoning as well.

Also proposed by Inhelder and Piaget (1958) was a second simple approach
comparing the number of times the target outcome occurs with the supposed
cause (or covariate) with the number of times it occurs without that cause
(for example, healthy plants with plant food vs. healthy plants without plant
food). This would compare contingency table cells a and b (strategy a-versus-b).
This strategy was identified by Inhelder and Piaget (1958) as typical of
young adolescents and was found by other investigators to be common among high
school subjects as well (Adi, Karplus, Lawson and Pulos, 1978).



Inhelder and Piaget (1958) proposed a third strategy as characteristic
of formal operational thinking. That is, subjects would compare frequencies
of events confirming (cells a and d) and disconfirming (cells b and c) a
relationship of a particular direction. This rule would compare the sums of
diagonal cells in the contingency table (sum of diagonals strategy).

Finally, Jenkins and Ward (1965) propose that covariation is most accurately
assessed by comparing the conditional probabilities of an event occurring
given each of the alternative states of the other variable (e.g., plant health/plant
food vs. plant health/no plant food). This would compare the frequency ratio
in contingency table cells $a/(a+c)$ with that in cells $b/(b+d)$ (conditional
probability strategy).
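
The four rules described above can be sketched in code. This is only an illustrative sketch, not the authors' materials: the function names and the example frequencies are hypothetical, each function returns +1 (outcome more likely with the factor present), -1 (less likely), or 0 (no difference), and the treatment of a cell a that is neither the largest nor the smallest cell (returning 0) is an assumption, since the text does not specify it.

```python
from fractions import Fraction


def sign(x):
    """Return +1, 0, or -1 according to the sign of x."""
    return (x > 0) - (x < 0)


def cell_a(a, b, c, d):
    """Cell-a rule: positive if cell a is the largest cell, negative if smallest.

    Returning 0 in all other cases is an assumption of this sketch.
    """
    others = (b, c, d)
    if a > max(others):
        return 1
    if a < min(others):
        return -1
    return 0


def a_versus_b(a, b, c, d):
    """A-versus-b rule: compare outcome-with-factor (a) to outcome-without (b)."""
    return sign(a - b)


def sum_of_diagonals(a, b, c, d):
    """Sum-of-diagonals rule: confirming cells (a + d) vs. disconfirming (b + c)."""
    return sign((a + d) - (b + c))


def conditional_probability(a, b, c, d):
    """Conditional probability rule: a/(a+c) vs. b/(b+d).

    Assumes both column totals (a + c and b + d) are nonzero.
    """
    return sign(Fraction(a, a + c) - Fraction(b, b + d))


# Hypothetical frequencies (not taken from Tables 2a or 2b) on which the rules disagree.
example = (6, 3, 12, 3)
judgments = {
    "cell-a": cell_a(*example),
    "a-versus-b": a_versus_b(*example),
    "sum-of-diagonals": sum_of_diagonals(*example),
    "conditional probability": conditional_probability(*example),
}
```

On this hypothetical table the a-versus-b rule judges the relationship positive (6 > 3), while the sum-of-diagonals and conditional probability rules both judge it negative (9 < 15, and 6/18 < 3/6).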

This analysis of possible rules may allow diagnosis of strategies actually
employed by children of various ages. That is, different rules should produce
different judgments on carefully constructed covariation problems. A set of
such problems is illustrated in Tables 2a and 2b. Solution accuracy is indexed



Insert Tables 2a and 2b here



by the direction of the judged relationship (i.e., more likely given B1, more
likely given B2, or no difference). Problems are structured hierarchically such
that cell-a problems are correctly solved by all strategies, and a-versus-b
problems are accurately solved by all strategies except cell-a. Sum-of-diagonals
problems are accurately judged by sum-of-diagonals and conditional probability
strategies, and conditional probability problems are accurately judged by the
conditional probability rule alone (see Table 3). The probability of matching these
Insert Table 3 here



judgment patterns by chance alone is .11 for cell-a, .04 for a-versus-b, .01
for sum of diagonals, and .005 for the conditional probability pattern.
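
The chance values above can be checked with a short calculation. The sketch below rests on assumptions that are not spelled out at this point in the text: that a chance responder guesses independently and uniformly among the three response alternatives, and that a problem type counts as "passed" when at least two of its three problems are answered correctly (the criterion given in the Results). Function names are hypothetical.

```python
from fractions import Fraction
from math import comb


def p_pass(n_problems=3, n_needed=2, p_correct=Fraction(1, 3)):
    """Binomial probability of at least n_needed correct guesses out of n_problems."""
    return sum(
        comb(n_problems, k) * p_correct**k * (1 - p_correct) ** (n_problems - k)
        for k in range(n_needed, n_problems + 1)
    )


def chance_of_pattern(n_types_passed, n_types=4):
    """Probability of passing exactly the first n_types_passed problem types
    and failing the remaining ones, under random guessing."""
    p = p_pass()
    return p**n_types_passed * (1 - p) ** (n_types - n_types_passed)


chance = {
    "cell-a": float(chance_of_pattern(1)),
    "a-versus-b": float(chance_of_pattern(2)),
    "sum of diagonals": float(chance_of_pattern(3)),
    "conditional probability": float(chance_of_pattern(4)),
}
```

Under these assumptions the probability of passing one problem type by guessing is 7/27, and the four pattern probabilities round to the values reported in the text.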

Shaklee and Mims (1981) used this rule diagnostic approach to study
covariation judgment strategies used by subjects from 4th grade through college
age. Subjects' judgment patterns in that age span showed a strong developmental
trend, with the a-versus-b strategy evidenced by substantial numbers of subjects
beginning in the fourth grade (29%), and sum of diagonals the modal strategy at
7th and 10th grade (50% of subjects). Conditional probability patterns were
produced by many subjects at the 10th grade (27%) but were still used by a
minority of subjects even in the college years (38%). Thus, this evidence
supports previous investigators' suggestions that children may use simpler,
less accurate rules as precursors to mature reasoning. However, these results
deviated from previous conclusions in two notable ways. First, the commonly
proposed cell-a judgment pattern was rare among subjects at any of the ages
tested (0-8%). In addition, the level of mature reasoning most often fell short
of the optimal judgment strategy.

These results further contrast with findings in the causal reasoning
research, where use of covariation information was seen in causal judgment
anywhere from preschool to 8-9 years of age. Shaklee and Mims (1981), on
the other hand, found that nearly half of fourth graders showed no systematic
bases of covariation judgment. A look at the causal reasoning research indicates
that these studies offered children a relatively easy task of covariation judgment.
DiVitto and McArthur (1978), for example, summarized the covariation information
for the subjects, allowing children to use the information in causal judgment
when they might not be able to derive that information for themselves.
In the remaining studies (Shultz & Mendelson, 1975; Siegler & Liebert, 1975),
the target event and its possible causes were either perfectly contingent
or completely independent. Studies of covariation judgment, on the other hand,
commonly ask for judgments about less-than-perfect relationships. This analysis
would indicate that young children may evidence a very simple understanding of
covariation which does not hold up well when judging relationships of intermediate
strength.

A final related paradigm must also be considered in understanding children's
covariation judgment. That is, one commonly employed test of probability
judgment is one in which a child is shown two piles of marbles composed of
different proportions of marbles of two colors. The subject is asked to
indicate the pile from which he or she would rather make a blind choice in
order to obtain the marble of a particular color. The judgment is formally
comparable to a covariation judgment, where a subject decides if a given outcome
is more likely under condition B1 or B2. Siegler's (1981) rule analysis of
children's performance in this paradigm shows systematic rule use by a narrow
majority of 5 year olds, with most of those children using a rule comparable to
the a-versus-b rule in covariation judgment research. By 8-9 years of age
a substantial majority of children were using systematic judgment rules,
with a comparison of conditional probabilities the modal response pattern in
Experiment 1, and a-versus-b the dominantly used rule in Experiment 2. Each
experiment found a comparison of conditional probabilities to be the most
common rule among 12 year olds and adults.

Thus, in contrast to covariation judgment research, Siegler found that systematic
rule use in a related judgment occurs at an earlier age, culminating in use of the
optimal rule by early adolescence. Siegler's (1981) findings may suggest that Shaklee
and Mims (1981) provide a conservative estimate of children's acquisition of
systematic bases of covariation judgment. Causal reasoning research also
indicates that some simple understanding of event covariation may be seen by
the early elementary school years. Possible resolution of these differences
may begin with a careful look at the covariation judgment paradigm. The reliable
strategy use evidenced by older subjects clearly indicates that they understood
the experimental stimuli and procedures. However, among the fourth grade
sample, 25% of the subjects produced unclassifiable response patterns, and
an additional 21% passed no strategy criteria at all. This high rate of
unsystematic responses may indicate that a substantial group of these children
were confused by the paradigm and thus unable to demonstrate systematic rules
which may be in their repertoires. If this were the case, a simplified approach
should be developed to test these younger subjects.

We address the question of early covariation judgment in two ways.
Experiment 1 employs a simplified paradigm to examine the development of
covariation judgment rule use among young elementary school children. Once
these normative trends are established, our second study investigates sources of this
shift to systematic rule use. In Experiment 2, we test information components which
may be sufficient to elicit reliable rule use among young children.

Experiment 1 

Simplification of our previous experimental procedure was accomplished in
two major ways. First, we were concerned that younger subjects might not
understand the stimuli represented in the 2x2 table. As a result, a new
introduction expanded the discussion of the contents of the table, having the
subject point to examples of each of the four possible combinations of
event states in the table.

Secondly, we suspected that our previous question format might be overly
complex for the younger children. The previous question asked (in the plant
food example discussed above),
When they got special food, plants were

a) more likely to be healthy than

b) just as likely to be healthy as

c) less likely to be healthy than
when they didn't get special food.

A reformulated question offered simpler syntax:
Plants were more likely to get better if

a) they got the special food

b) they did not get the special food

c) no difference

We expected that this simplified question would be more appropriate
to the language competencies of younger subjects. Experiment 1 also included
two different problem sets in anticipation of our needs in the subsequent study.

Method 

Subjects 

Subjects in the experiment were respondents to an advertisement in a small
town newspaper offering payment to second, third and fourth grade children for
participating in a psychology experiment. The resultant sample included 37 second
graders, 18 third graders, and 17 fourth graders.
Problems 


Subjects judged one of two sets of 12 covariation problems, each structured to
produce a distinctive pattern of solution accuracy by each of the four proposed judgment
rules. In one set of problems, cell frequencies totaled 24 for each problem (set 24);
in the other set, cell frequencies totaled 36 for each problem (set 36). Except for
these frequency differences, the two problem sets were identical.
Tables 2a and 2b show the actual problem frequencies used for the problems in each of
the two sets. The 12 problems in each set included three problems for each of the four
strategy types. One noncontingent (middle row, Tables 2a and 2b) and two contingent
relationships (top and bottom rows, Tables 2a and 2b; P(A1/B1) - P(A1/B2) = [illegible] to .50)
were included for each problem strategy type. Table 3 shows the pattern of solution
accuracy congruent with each of the proposed rules.

Each problem was set in a concrete context of two everyday events which
may or may not be related. Each individual event pairing was illustrated with
a small picture showing the state of the two variables (e.g., plant sick or
healthy/plant food present or absent). Three problems pictured bakery products
which either rose or fell in association with the presence or absence of
yeast, baking powder, or a "special ingredient". In three other problems,
plants were pictured as healthy or sick as a possible function of the presence
or absence of plant food, bug spray, or a "special medicine". In three problems
people or animals were pictured as sick or healthy as a possible
function of the presence or absence of a shot, liquid medicine, or a pill. The
three remaining problems pictured a possible association between space
creatures appearing happy or sad in the presence or absence of one of three
weather conditions (snow, fog, or sunshine).

For each problem, data instances were organized in a 2 x 2 table. In
each case, the manipulated factor (or environmental event) defined the table
columns (e.g., plant food, no plant food in the example below), and the outcomes
defined the table rows (e.g., plants healthy, not healthy in the example
below). Each problem was introduced with a paragraph describing a context in
which several observations were made on two potentially related variables.
Subjects were asked to look at the pictured information and to identify the
relative likelihood of one of the events when the second event was either
present or absent. An example problem follows:
A plant grower had a bunch of sick plants. He gave some of them special
plant food, but some plants didn't get special food. Some of the plants
got better but some of them didn't. In the picture you will see how many
times these things happened together. The picture shows that the plants
were more likely to get better if:

A. they got the special food.

B. they did not get the special food.

C. no difference (they were just as likely to get better with food
as without the food).

The 12 problems were grouped into three problem blocks, each including one
problem from each strategy type. Problems within each block were arranged in a single
random sequence. The three problem blocks were sequenced in a single random
order. Numbers in parentheses to the left of the problems in Tables 2a and 2b
indicate the position of each problem in the problem sequence.
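
A blocked sequence of this kind could be generated as in the sketch below. It is an illustration under assumptions, not the procedure the authors used: the text does not say how individual problems were assigned to blocks, so the sketch simply places the i-th problem of each strategy type in block i, and the function name, problem labels, and seed are hypothetical. A fixed seed stands in for the "single random sequence" shared by all subjects.

```python
import random

STRATEGY_TYPES = [
    "cell-a",
    "a-versus-b",
    "sum-of-diagonals",
    "conditional-probability",
]


def build_sequence(problems_by_type, seed=0, n_blocks=3):
    """Return one fixed problem order: blocks of one problem per strategy type.

    problems_by_type maps each strategy type to a list of n_blocks problems.
    Assigning the i-th problem of each type to block i is an assumption.
    """
    rng = random.Random(seed)  # fixed seed: the same order for every subject
    blocks = []
    for i in range(n_blocks):
        block = [problems_by_type[t][i] for t in STRATEGY_TYPES]
        rng.shuffle(block)  # random order of problems within the block
        blocks.append(block)
    rng.shuffle(blocks)  # random order of the blocks themselves
    return [problem for block in blocks for problem in block]


# Hypothetical problem labels: (strategy type, index within type).
problems = {t: [(t, i) for i in range(3)] for t in STRATEGY_TYPES}
sequence = build_sequence(problems)
```

Each consecutive group of four problems in `sequence` then contains exactly one problem of each strategy type.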
Procedure 

Each subject was tested individually. Introductory instructions introduced
the subject to the concept of covariation in the context of "things that go
together". Naturally occurring examples were given of positive relationships
(i.e., tall people are more likely to be heavy than short people), negative
relationships (i.e., it is less likely to rain when it is sunny than when it
is cloudy), and unrelated events (i.e., a green truck is just as likely to run
out of gas as a red truck). Subjects were told that they would be given some
problems about hypothetical events that may or may not tend to go together.
Two sample problems were used to clarify the information in the 2x2 table.
The first sample problem was read to the subject. The subject was told that
pictures in the cells showed the occurrence or nonoccurrence of the two events
in the story. The experimenter then pointed out that each cell represented a
different combination of the two possible events and stated what these were.
The subject was asked to point to cells corresponding to specific combinations
of events given by the experimenter. The experimenter explained that each
picture in the cells represented one occurrence of a particular combination
of events, so that the number of pictures in each cell represented the number
of times that combination occurred. The experimenter then read the covariation
question to the subject and asked him or her to answer it based on the events
pictured in the table. It was emphasized that subjects should answer the
questions based on what had occurred in each story problem and should avoid
basing answers on knowledge of common everyday occurrences (for example,
that it is more likely to snow when it is cold, regardless of cell frequencies).
Each subject gave a solution to the problem and repeated the procedure on
the second sample problem. Subjects were encouraged to ask any questions
they might have about the task.

The subject then proceeded to the 12 problem set. Each of the problems
in the set was read to the subject by the experimenter. Subjects were allowed
to answer the problems at their own pace.

Results 

Our main interest in this study was to establish trends in strategy
use among these younger subjects. As a result, the analyses in this study use
subject strategy classification as the dependent variable of interest.
Subjects were classified for strategy use according to the method illustrated
in Table 3. A subject was said to have "passed" a given problem type if he or
she was accurate on two or more of the three problems of a given problem type.
A subject who met this criterion on all problem types would be classified as
a conditional probability rule user; subjects who passed criteria on all types
except the conditional probability problems were labeled sum-of-diagonals
judges. A-versus-b judges should pass the cell-a and a-versus-b problems,
but not the other problem types; cell-a rule users should pass criterion on
cell-a problems alone. Subjects who passed no problem types were labeled
Strategy 0; all other judgment patterns were categorized as unclassifiable.
Table 4 shows the rule classifications of subjects in each of the three grades.
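
The classification scheme just described can be written out as a small lookup. This is a sketch under the criteria stated above; the function names and the input format (three correct/incorrect flags per problem type) are hypothetical conveniences, not part of the original procedure.

```python
PROBLEM_TYPES = (
    "cell-a",
    "a-versus-b",
    "sum-of-diagonals",
    "conditional probability",
)

# Pass/fail pattern over the four problem types -> rule classification.
PATTERN_LABELS = {
    (False, False, False, False): "Strategy 0",
    (True, False, False, False): "cell-a",
    (True, True, False, False): "a-versus-b",
    (True, True, True, False): "sum-of-diagonals",
    (True, True, True, True): "conditional probability",
}


def passed(correct_flags):
    """A problem type is passed with at least two of its three problems correct."""
    return sum(correct_flags) >= 2


def classify(accuracy):
    """Classify a subject from a dict mapping problem type -> three booleans."""
    pattern = tuple(passed(accuracy[t]) for t in PROBLEM_TYPES)
    return PATTERN_LABELS.get(pattern, "unclassifiable")


# Hypothetical subject: passes the two simplest problem types only.
subject = {
    "cell-a": [True, True, True],
    "a-versus-b": [True, False, True],
    "sum-of-diagonals": [False, False, True],
    "conditional probability": [False, False, False],
}
label = classify(subject)  # "a-versus-b"
```

Any pass/fail pattern outside the five hierarchical patterns, for example passing only the sum-of-diagonals problems, falls through to "unclassifiable".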



Insert Table 4 here 



The modal classification at each of the grades was a-versus-b, with very few
subjects showing evidence of use of more sophisticated rules and a few subjects
at each grade with cell-a rule judgment patterns. Many subjects in the second
and third grades made judgments that were not classifiable by any of our rules.
Effects of grade level and problem set were examined by assigning subjects a
score according to the number of problem type criteria passed. Thus, Strategy
0 subjects were assigned a score of 0, conditional probability subjects a
score of 4. Unclassifiable subjects could not be clearly ranked in this way
and were excluded from these analyses. Data from the remaining subjects were
analyzed in an analysis of variance with subject's grade (2, 3, or 4) and
problem set (24 or 36) as factors. These analyses showed a significant effect
of grade, F(2, 51) = 3.30, p < .05, with third and fourth graders similar to
each other, and classified as using more advanced rules than the 2nd graders
(Duncan's multiple range test, p < [illegible]). Problem set effects were not significant.
Discussion

Related research in causal reasoning and probability judgment indicated that
children might show some simple understanding of event covariation by early elementary
school. This experiment found that a majority of children do show systematic rule use
in covariation judgment by the second grade. Significant age trends also show an
increase in systematic rule use with age in the second to fourth grade age span. Rule
categorizations in this age range show a substantial decline in unclassifiable and
Strategy 0 subjects with increasing age and an increase in a-versus-b rule use.
However, use of more advanced rules was rare at all ages tested.

Comparison with Shaklee and Mims (1981) indicates that subjects did indeed
show earlier competencies with our revised procedure. Nearly all fourth graders
were classifiable by one of our proposed rules in the present experiment and
a majority of children showed systematic rule use in the second and third grades.
Overwhelmingly, these children were classified as using the a-versus-b rule. The
low frequency of more sophisticated strategies is comparable to that seen in
our prior research. Also similar to our past results is the low rate of
usage of the cell-a strategy. This is especially interesting, given that it is
the most commonly proposed of the judgment strategies and was even said to be
the modal strategy among adults (Smedslund, 1963; Nisbett & Ross, 1980).
Our evidence finds this strategy to be rare among children as young as second grade.

These results would indicate that our prior procedures may have been
unnecessarily confusing to younger subjects. Our prior and present procedures
were not systematically compared in this paradigm, nor did we compare aspects
of the changed procedure (e.g., instruction vs. question format) in a factorial
design. As a result, we can offer little information about what aspects of
the prior procedure may have been a problem. However, it is clear that we
have developed a procedure suitable for use with young children. These findings
indicate that children as young as second grade use simple but systematic rules
in judging event relationships.

Age trends in this paradigm show origins of rule use in covariation
judgment at age levels comparable to those found by researchers in causal and
probabilistic reasoning. However, in one respect, these results differ from Siegler's
(1981) data on children's probability judgments. In those experiments, substantial
numbers of children used the conditional probability rule by 8-9 years. In
fact, a comparison of conditional probabilities was the modal response pattern in this
age group in one of his experiments. In contrast, none of the subjects in
this experiment was classified as using the conditional probability rule and only a
few used the sum of diagonals rule. Our past research (Shaklee & Tucker,
1979; Shaklee & Mims, 1982) found the conditional probability rule to be used
by only a minority of subjects even at adulthood. Thus, comparability between
these paradigms in terms of early rule use is not matched by performance
similarity in the later years. Expressing a judgment in terms of marbles in
piles elicits more advanced rule use than a question asking for a comparable
decision in terms of covariations between potentially related events. One
difference may be that our problems are set in contexts of events that are
readily interpreted as causally related. Adi and colleagues (1978) found that
subjects used simpler, less accurate rules in evaluating cause-effect relationships
than in making covariation judgments on analogous problems. Evidence such
as this may indicate that covariation judgment in a causal context lags behind
the same judgment about non-causal relationships.

Our evidence of systematic rule use at an early age is intriguing, but
equivalently interesting are the unsystematic judgments of so many age peers.
That is, at second and third grades a majority of children are classified by
one of our rules (59% and 61% respectively), but a substantial minority in
each grade produce unsystematic judgment patterns (19% and 39% respectively)
or pass no problem type criteria (Strategy 0 = 22% of second graders).
Inspection of individual subjects' judgment patterns failed to identify any alternative
strategic bases of these responses. Thus some children are unsystematic in
rule use at the same age as other children begin to show use of simple
judgment strategies. What did these rule users know that allowed them to judge
the problems in a systematic fashion? Several factors may differentiate these
rule users from their unsystematic age peers.

One possibility may be that unsystematic subjects are not using the tabled
frequencies at all, but rather are judging the event covariations on the basis
of their prior expectations about the event relationships. For example, such
children may decide that plants are more likely to be healthy when they get
plant food based on their real world experience, regardless of the event
frequencies in the problems they are asked to judge. Our instructions already
caution subjects against making expectancy-based judgments, but those instructions
may be readily forgotten as the subject solves the problems.

Expectancy-based judgments may be a source of unclassifiable response
patterns, but what leads others of these young subjects to adopt an a-versus-b rule?
We suspected that the judgment question itself may direct children's attention
to cells a and b of the contingency table. Asked if plants are more likely to
be healthy when they get plant food or when they do not get plant food, a
subject may look at these two event conjunctions (i.e., healthy plants-plant food,
healthy plants-no plant food). A subject must also attend to the comparative
aspect of the question in order to employ the a-versus-b rule. Mastery of either
the attention direction or comparative aspects of the judgment (or both) may be
key competencies underlying the shift to a-versus-b rule use at these early
ages.
These are plausible sources of development in covariation judgment, but
their roles in the origins of systematic rule use have yet to be demonstrated.
An approach often employed to model a naturally occurring developmental trend
is a training paradigm. That is, one might identify a training program which
teaches non-rule users the knowledge said to differentiate those subjects from
rule-based age peers. Contents of a successful training procedure identify at
least one sufficient model to account for the natural transition to systematic
rule use.

Experiment 2 

We propose to use this training strategy in Experiment 2 to investigate the
origins of systematic rule use in judging event covariation. Results of
Experiment 1 indicated that reliable rule use was already becoming common in the
second grade sample. Thus, Experiment 2 was an attempt to train first and second
grade children to use the a-versus-b rule. We chose not to train children in use
of the cell-a rule since it so rarely occurred naturally.

If young children's judgments are unsystematic because they are expectation-based,
this problem would best be treated by drawing children's attention to the
frequency information in the tables. Thus, one training procedure directed
children's attention to the frequencies involved in the a-versus-b rule, i.e.,
cells a and b. This was the reasoning behind the Attention-only condition, where,
on a set of 6 training problems, the experimenter asked the subject to point
to the event combinations specifically mentioned in the question and to count
the number of cases in each of the two cells. Subjects then made their covariation
judgment.

As suggested previously, a subject may also fail to use the a-versus-b
rule because he or she misses the comparative aspect of the question (i.e., which is
more likely). A second group of subjects was given the Attention instructions on
the training problems and, in addition, was specifically asked which of the two
cells had more cases in it. Subjects then made their covariation judgments. This
group is the Attention-plus-more training group.

A final group is a no-training control group, who judged the same
problems but were given no special instructions.

All subjects were pretested to establish initial rule use. Unclassifiable,
Strategy 0 and cell-a judges were included in the paradigm. Subjects were
randomly assigned to one of the three conditions. Training effects were measured
in a posttest given about a week after the training session. In view of their
comparability in Experiment 1, problem sets 24 and 36 were both used in this
experiment.

Method

Subjects 


Subjects were respondents to ads in a small town newspaper offering first
and second graders payment for participation in a psychology experiment. Forty-nine
subjects participated in the pretest session of the experiment. However,
13 subjects were dropped from the experiment because their pretest strategy
indicated that they were already using the a-versus-b (9 subjects) or a more
advanced strategy (3 sum-of-diagonals subjects, 1 conditional probability
subject). The remaining 36 subjects (18 males and 18 females) included 13
unclassifiable, 17 Strategy 0, and 6 cell-a subjects. Mean age of these
subjects was 7 years-6 months (range 6 years-10 months to 8 years-0 months).

Pretest

Problems and instructions on the pretest were identical to those described
in Experiment 1. Half of the subjects were given problem set 24 for the
pretest and set 36 for the posttest; the remaining subjects were given the problem
sets in the reverse sequence.

Once the problem set was completed, the experimenter determined the
subject's judgment strategy in the manner described in Experiment 1.
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Train lag 

Six new problems were deveiopei for craining. These problems used cell 
freque-^cies and concancs which were different from those uced in the two test 
sets. Subjects classified as cell-a. Strategy Oj or unclassif iable were 
randomly assigne<^ to one of three training conditions (12 subjects per condition). 

Attention-Qnly . This training was designed to direct subject's attention 
to tht two event pairings specifically mentj.oned in the question (i.e., cells 
a and b) . Verbatim instructions for this condition were as f.,llows (portions 
were re-phrased if necessary) : 

In doing these problems, you may have had a certain way of deciding which 
answer you thought was right. For example, ^ou may have thought that 
certain boxes and the pictures in them wer^ important and other boxes 
were not important in answering the question. Or you nay have compared 
certain boxes with eaca other. Jf on^ thing happened more than another 
thing, it may have been more likely to happen. Now we are going tc see 
if there Dight be another way to solve these problems that may be better 
than the way you used. We will try to decide which boxes and the pictures 
in th^.^ are important in deciding which answer is right. X want you to 
think hai'f nuw about a good way to answer these problems. I'll a^ik you 
some ques^^on^ to help figure out a way to decide what answer is right. 
(The first problem and question were read to the child.) 
If we waated to decide urhich answer is right, it is important to look at 
each answer and find good examples or pictures that show thac thing 
happening. For example, let us suppose we wantf^d to see if answer a 
Tii^ht be the right auswer. .\nrwer A ^ lys (e.^., the bugs a.e more likely 
uo era I on tht^ leaves when it is ::iurny out), Ceuid you show me which 
b*^:: or pic:...res are good examples of that? Uliich pictures ^how wher-^ the 
(bugs croiwL on the leav^^^ wheti It is sur^ny out)? 
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(Subjects should point to cell and were corrected if they did not. 
When subjects did point to ceil ) 

Righc. Can you tell me why? So these pictures show the (bugs crawling 
on the leaves when It is sunny out). This I5 important box to look at 
ir deciding if answer A is right. And how many times did thar happen? 
So th^re are good examples of answer A- 

(The experimenter also poirted to other cell«, asked or pointect out why 
they were not good examples.) ^ 

Now let us look at answer By because that could also be the right answer. 
(The same procedure was repeated. Subjects should point to cell b^* The 
experimenter selected answer A and answer B to be discussed first with 
approximately equal f requeincies . The discussion was then summarized*) 
Okayj so that n'^ns that if we wanted to see if (question w^.th answer A 
is read) tnis box (cell a^) and the pictures it would be important to 

look at* ^nd we see that it happened times. If we wanted to see 

if (question with answer 6 is read) this box (cell b) and the pictures in 

it would be important to Idok at. And we see that this happened 

cime^. It is also possible that answer C is correct, that it didn't make 
an> difference (if it was sunny or not, the bugs were just as likely ro 
crawl on che leaves) 

The covariation judgment question was then read to the subject and he or 
she made a response. 

Attention-:>Iu^-more . This training conoition was designed to e^phasiza 
? comparative aspect of the question^ i.e., which outcome v:as mor^ likely? The training 
builds on the Atcention-only tr^inin^ described earlier. Subjects in this condition 
hejird all -ji the Instructjoni; in Che Attention-only training, and were then 
-^aked to makd ^ ^rect cOr tri^on of cell a and cell b frequencies ("Which ot 
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chese tvo things happened more?"), the experime^cer then read che 
covariation judgment question to the subject and he she made a response. 

Cont rol . Subjects in this condition judged the same problems as subjects 
in the other groups, b^jt were offered training instructions. 

In each training condition, the procedure described was repeated on the 
six training problems. Feedback (positive or negative) was not provided 
following the subjecr/s answers to t:he covariation judgment question. 
Posttest

Subject fatigue prevented an immediate posttest of training effects.
However, all subjects did return approximately one week later for a delayed
posttest. This posttest was administered by a second experimenter who was
blind to the training condition of the subject. The experimenter first
reviewed the stimulus materials and problem format by presenting one of the
sample problems used in session 1. Following this, the second problem set was
administered in the same manner as in session 1. Subjects were tested on
the problem set (24 or 36) not judged in the pretest session. Following
completion of the problem set, subjects were told the purpose of the
experiment and its potential relevance to everyday causal reasoning.

Results

The first indication of the relative success of the training methods
was children's performance on the 6 training problems. Subjects responded in
the manner predicted by the a-versus-b rule on 43% of the problems in the
Control group, 72.2% of the problems in the Attention-only group, and
97.2% of the problems in the Attention-plus-more group. An overall analysis
of variance indicates these differences to be reliable, F (2,33) = 18.8,
p < .001. Pairwise comparisons indicate that each training group is
significantly different from each of the other groups (Duncan's multiple
range test, p < .05).

Effects of the training procedure are most clearly assessed by comparison
of the posttest performance of subjects in the training and control
conditions. These effects will be analyzed both in terms of the accuracy of
subjects on the various problem types and in terms of their posttest strategy
classifications.

For each subject, posttest judgment accuracy was assessed in terms of the
percentage of correct judgments for each of the 4 problem types. These data
were analyzed in an analysis of variance including problem type (4 levels) and
subject's training condition (3 levels) as factors. This analysis indicated a
significant main effect of problem type, F (3,99) = [illegible], p < .001,
and a significant interaction between problem type and training condition,
F (6,99) = 5.78, p < .001. As the means in Table 5 indicate,
Attention-plus-more subjects were



Insert Table 5 here



substantially more accurate on cell-a and a-versus-b problems than on sum of
diagonals and conditional probability problems. Attention-only and control
subjects' performance was similarly poor across problem types. The main
effect of training condition was not significant.

Pretest and posttest strategy classifications were compared for each
subject to note training effects. Judgment was said to have improved if a
subject was classified as using the a-versus-b, sum of diagonals, or
conditional probability strategy at posttest. Table 6 indicates the
frequencies of improvement. An overall chi-square test shows these training
effects to be significantly different between conditions (chi-square = 11.02,
df = 2, p < .01). As indicated in the table, rates of improvement were at
similarly low levels (25%) in the control and Attention-only conditions,
compared with substantial rates of improvement (83%) among
Attention-plus-more subjects.
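
The reported test statistic can be checked against the frequencies in Table 6.
The following is a minimal sketch, assuming an ordinary Pearson chi-square
without continuity correction; the function name `pearson_chi_square` is ours
and is not part of the original analysis.

```python
# Observed frequencies from Table 6.
# Rows: improved / did not improve.
# Columns: Control, Attention-only, Attention-plus-more.
observed = [
    [3, 3, 10],
    [9, 9, 2],
]


def pearson_chi_square(table):
    """Return (chi-square, df) for an r x c table of counts (no correction)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            # Expected count under independence of condition and improvement.
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (obs - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, df


chi2, df = pearson_chi_square(observed)

# Improvement rate per condition (12 subjects each).
rates = [improved / 12 for improved in observed[0]]

print(f"chi-square = {chi2:.3f}, df = {df}")
print([f"{rate:.0%}" for rate in rates])
```

Under these assumptions the statistic comes out at about 11.02 with 2 degrees
of freedom, and the improvement rates are 25%, 25%, and 83%, in line with the
figures given in the text.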

Discussion

These results offer clear evidence of the differential effectiveness of
our various training conditions. First, spontaneous improvement from test
to retest was rare among subjects in the control condition. This would
suggest that these young subjects' problems were not simply lack of
familiarity with the problems.

Improvement rates were equally low in the Attention-only condition. This
null result indicates that simply directing attention to cells a and b is not
sufficient to elicit a-versus-b rule use among these children. The failure of
Attention-only instructions may imply that subjects at this age already know
how to find the cells mentioned in the question. If this were the case,
control and Attention-only subjects would be essentially equivalent in
knowledge state at posttest. One would also expect that the Attention-only
training would be sufficient to overcome any tendency to make
expectations-based judgments. That is, children's attention was repeatedly
directed to the information in those cells. Indeed, the children's improved
performance on the training problems suggests that the training was
successful in eliciting frequency-based
ji^^ru:nr,.. Hoveve;:, t ■ ir\ w^ri^ net Tn.jtintaint>d at the posttest one 

'^K'^.- .^'.urr:!-', anv \\ '-^'f^ct ' . jt 1^ .^t one alternative iiUerpr^--t.K 

W-'i-^ I , ' ' , / Lr.un.'s,-, ..^^:.d;:loa ;ii "] ' "hsiVc ,-5i'!^p)y brf^n i .e t : ol. t ivf 
However, the Attention-plus-more training did result in reliable improvement
at the posttest. This finding indicates that the comparative aspect of the
judgment may be a key obstacle to natural use of this simple rule by young
subjects. Although they may know that two cells of the table are relevant,
apparently subjects this young cannot spontaneously derive a way to combine
that information to make a single judgment. Our training in the "more" rule
apparently offers them that information. Since this training builds on the
information offered in the Attention-only condition, this effect may hinge on
the combined influence of the attention direction and comparative aspects of
the question. Unfortunately, a "More-only" condition is logically impossible.
One cannot talk about comparing cells without designating which cells are to
be compared. The fact that these training effects held over a one week delay
period indicates the reliability of the knowledge the children acquired.

Finally, it is worth noting the specificity of our training effects. That
is, all children who improved in strategy use showed use of the a-versus-b
strategy. This aspect of the results indicates that subjects were not simply
learning to be systematic in judgment bases. Rather, they acquired one
specific judgment rule. On this problem set, use of the a-versus-b rule did
not lead to an overall improvement in judgment accuracy. This is by design
of the problem set. That is, a-versus-b judges should be correct on cell-a and
a-versus-b problems but incorrect on the sum of diagonals and conditional
probability problems. Thus, the successful Attention-plus-more training
actually results in worse performance on half of the problems compared to the
other two conditions.

These training effects offer one sufficient model of the natural process
of acquiring the a-versus-b rule. That is, subjects whose attention was
directed to cells a and b and who were instructed to compare the two cells
showed a-versus-b rule use. Thus, these two knowledge components may be the
source of children's natural shifts to a-versus-b rule use. Of course, a
sufficient process is not always a necessary one. That is, children may
spontaneously discover the rule through yet another sufficient process.

These training effects may also be appreciated in a broader context.
That is, research in causal reasoning indicates that some simple understanding
of event covariation may begin in early elementary school (Shultz & Mendelson,
1975; Siegler & Liebert, 1974). Siegler's (1981) work in probability judgment
shows similar age trends in children's use of simple rules in comparing
probabilities. This evidence indicates that these competencies may be shown
at an even earlier age with a brief training procedure. It may be interesting
to see if these improvements in covariation judgment would influence
children's causal reasoning as well. This may be a domain in which to test
children's ability to apply statistical concepts appropriately to related
judgments.

Whether children could learn to use a more complex rule with appropriate
training is a question for future research. However, the level of math
involved in our other rules may preclude their use in early elementary school.
The sum of diagonals rule requires a comparison of two sums; the conditional
probability rule compares two ratios. These advanced arithmetic competencies
are likely to be outside of the capacity of such young children.

In overview, these two studies offer new information about covariation
judgment in the early elementary school years. That is, many children
spontaneously show use of the a-versus-b rule as early as second grade.
Children as young as first grade can be taught to use this simple rule if

:ert?d ciTie r^ilev.mc m tor^iu^t ion* Vl^ is t ra;o inj evident^i? of t ^rs orit-> -^uii ic i-'M" 
oi th^^ natural acquisition of a ^inipie rule f :r juc^iin? 'C^t^Ia' on^^rnp^ 


References

Adi, H., Karplus, R., Lawson, S., & Pulos, R. Intellectual development beyond
     elementary school VI: Correlational reasoning. School Science and
     Mathematics, 1978, 78, 675-683.
Bullock, M., Gelman, R., & Baillargeon, R. The development of causal
     reasoning. In W. Friedman (Ed.), The Developmental Psychology of Time.
     New York: Academic Press, 1981.
DiVitto, B., & McArthur, L. Developmental differences in the use of
     distinctiveness, consensus, and consistency information in making causal
     attributions. Developmental Psychology, 1978, 14, 474-482.
Inhelder, B., & Piaget, J. The Growth of Logical Thinking from Childhood
     to Adolescence. New York: Basic Books, 1958.
Jenkins, H., & Ward, W. Judgment of contingency between responses and
     outcomes. Psychological Monographs, 1965, 79, 1-17.
Kelley, H. Attribution theory in social interaction. In E. Jones, et al.
     (Eds.), Attribution: Perceiving the causes of behavior. Morristown,
     NJ: General Learning Press, 1972.
Nisbett, R., & Ross, L. Human Inference: Strategies and Shortcomings of
     Social Judgment. Englewood Cliffs, NJ: Prentice-Hall, 1980.
Shaklee, H., & Mims, M. Development of rule use in judgments of covariation
     between events. Child Development, 1981, 52, 317-325.
Shultz, T., & Mendelson, R. The use of covariation as a principle of causal

P. D*^:in::t^ Lne lor.u-r or" d-ve Ilj ^ - r nta 1 d'.fference in chiLldr^-rs 






Siegler, R. Developmental sequences within and between concepts. Monographs
     of the Society for Research in Child Development, 1981, 46, Whole No. 189.
Siegler, R., & Liebert, R. Effects of contiguity, regularity and age on
     children's causal inferences. Developmental Psychology, 1974, 10, 574-579.
Smedslund, J. The concept of correlation in adults. Scandinavian Journal
     of Psychology, 1963, 4, 165-173.






Footnote

We had some difficulty defining a noncontingent relationship for the
sum of diagonals problems. The problem we included (middle problem, column
3, Tables 2a and 2b) deviates slightly from independence by the conditional
probability rule (P(A1/B1) - P(A1/B2) = -.06, set 24; -.03, set 36).
As a result we scored responses as correct if subjects concluded that A1/B1
was either less likely than or just as likely as A1/B2. The problem does
discriminate appropriately between the other judgment rules. Cell-a and
a-versus-b judges should say that A1/B1 is more likely than A1/B2; sum of
diagonals judges should say the two outcomes are equally likely.
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Tables 2 

A) Cell frequencies used for problems in problem set 24. 



Cell a 
Problems 



^ versus b 
Problems 



Sum of Diagonal 
Problems 



Conditional 
Probability 
Problems 



(U) 


^1 


^2 


(6) 


h 


^2 


(2) 






(9) 




^2 




U 


2 




7 


3 




2 


2 




2 


' " 

12 


^2 


4 


7 


^2 


2 


12 


^2 


2 


18 


^2 


0 


10 



(3) 



(9) 



(7) B^ B^ 



(12) B^ B^ 



15 



(5) 




^2 


(4) 




^2 


(10) 


^1 


^2 


Ci) 


^1 


^2 


h 


2 


11 


^1 


4 


11 


^1 


8 


8 


^1 


12 


2 




7 




2 


8 


1 


^2 


8 


0 


. ^2 


10 


0 



B) Cell frequencies used. for problems in problem set 36. 



Cell a 
Problems 



vet sus _b 
Problems 

(6) 



Sum cf Diagonal 
Problems 

(2) Bj^ 



Conditional 
Probabilicy 
Problems 



(8) 



A. 


16 




^1 [ii 


2 




4 


4^ 




3', 


18 


A^ 


6 


10 


^2 LL 


16 


^2 


4 


<24 


A, 


0 


15 



(3) 



(9) B^ 



(7) B^ B^ 



(32) B^ 



\ 


9 


9! 




5 


5 






12 


9 




2 


7 




9 


A 


^2 


. 13 


13 




A ^ 


9 


6 


A 

2 


6 


21 



(5) 



(10) B^ 



94 



(i) B, 8, 





3 


16 




6 


18 




U 


a 


A. 


18 


3 


A. 


iO 


7 


A^ 


9 


3 


^2 


il 


3 


A, 


15 


0 
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Table 3 

Strategy Classification Criteria

Strategy use and resultant patterns of problem accuracy
(+ = accurate, 0 = inaccurate)

                                      Problem Strategy Type

Subject                                              Sum of      Conditional
Strategy Type              Cell a    a versus b    Diagonals    Probability

Conditional Probability       +          +             +             +
Sum of Diagonals              +          +             +             0
a versus b                    +          +             0             0
Cell a                        +          0             0             0
Strategy 0                    0          0             0             0
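
The classification criteria of Table 3 amount to a lookup from a subject's
pattern of passed problem types to a strategy label. The sketch below is one
way to express that lookup, assuming each problem type has already been scored
as passed or failed; the criterion for passing a problem type is not stated in
this section, so the function takes pass/fail values as given. The names
`PROBLEM_TYPES`, `PATTERNS`, and `classify` are illustrative only.

```python
# Order of problem types, from simplest to most demanding.
PROBLEM_TYPES = (
    "cell_a",
    "a_versus_b",
    "sum_of_diagonals",
    "conditional_probability",
)

# Accuracy patterns (True = accurate, False = inaccurate), one per strategy.
PATTERNS = {
    (True, True, True, True): "conditional probability",
    (True, True, True, False): "sum of diagonals",
    (True, True, False, False): "a versus b",
    (True, False, False, False): "cell a",
    (False, False, False, False): "Strategy 0",
}


def classify(passed):
    """Map a dict of problem type -> passed (bool) to a strategy label.

    Any pattern not listed in PATTERNS is labeled unclassifiable.
    """
    key = tuple(bool(passed[problem_type]) for problem_type in PROBLEM_TYPES)
    return PATTERNS.get(key, "unclassifiable")


example = {
    "cell_a": True,
    "a_versus_b": True,
    "sum_of_diagonals": False,
    "conditional_probability": False,
}
label = classify(example)
```

In this example the subject passes only the cell-a and a-versus-b problems, so
the label is "a versus b". A subject who passed the sum of diagonals problems
but failed the cell-a problems would fall outside the hierarchy and be labeled
unclassifiable.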
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Table 4 



i f labK- 
6 



yum of , Co^ndiClonai 
Stratv^jy 0' Ct-JJ-d o-vtrsus-b DLiniMials Probability 



0 
n 



16 
U 

J 8. 



40 
44 
71 



0 
0 
0 



N 
37 

17^ 
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Table 5 
Experiment 2
Mean percent correct for each problem type

                                        Problem Type

Training                                          Sum of     Conditional
condition              Cell-a    a-versus-b     Diagonals    Probability     All

Attention-plus-more     83.3        80.6            8.3           5.5        44.4
Attention-only          55.4                       27.8          33.3        40.2
Control                 52.8        38.8           44.4          24.8        40.2
All                     63.8        54.6           26.8          21.2        41.6
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Table 6 i 
Effects of a-versus-b training on posttest performance

                                Attention    Attention
                    Control       Only       plus more     Total

Improved               3            3           10           16
Didn't Improve         9            9            2           20
Total                 12           12           12           36



Training for Improved Covariation Judgment
Harriet Shaklee, Laurie Hall, and Don Passek
University of Iowa



Paper presented to the Psychonomics Society, November, 1982, Minneapolis, Minnesota

A variety of theorists have suggested that covariation judgment may
be a key element in causal reasoning. That is, people may find likely
causes of an event by searching for covariates of that event. If causal
and covariation judgment are interlinked in this way, then accuracy of
covariation judgment may set an upper limit to an individual's competence
at causal reasoning.

Evidence from our own investigations indicates that people show
wide individual differences in competence at covariation judgment. In
particular, a majority of adults employ rules which may lead to better
than chance accuracy, but which result in systematic errors on some
event relationships. We've focused our investigation on four strategies
which might account for subjects' judgment patterns. Each of these
strategies will be discussed in terms of the four cells of a 2 x 2
contingency table, labeled cells a, b, c, and d in a left to right, top
to bottom sequence. One commonly proposed strategy is to judge a relationship
according to the number of times the target event states co-occur, cell
a of the contingency table. We term this strategy the cell-a strategy.
A second approach might compare the number of times the target event
occurs with its supposed cause with the number of times that event
occurs without that possible cause. This strategy would compare frequencies
in contingency table cells a and b, a strategy we call a-versus-b. A
third strategy might compare the number of events confirming a relationship
of target event and supposed cause with the number of events which would
disconfirm such a relationship. This strategy would compare the sum of
frequencies in cells a and d with that of cells b + c, a strategy we
term sum of diagonals ((a + d) - (b + c)). Finally, a mathematically
sophisticated approach would compare the probability of the target event
given the supposed cause with the probability of the event when that
cause was absent. We call this strategy the conditional probability
strategy, and it is the only one of our strategies which will always produce
correct judgments of event covariation.

Thus, we propose four different judgment rules varying in complexity
and likely accuracy. Since different rules should produce different
judgments, we can construct a problem set where each solution strategy
produces a unique solution pattern. A sample of such problems is illustrated
in Table 1a. Problems are structured hierarchically such that cell-a
problems are accurately judged by all rules; a-versus-b problems should
be correctly judged by all but cell-a judges. Sum-of-diagonals problems
should be accurately judged by sum-of-diagonals and conditional probability
judges, and conditional probability problems should be accurately judged by
the conditional probability rule alone. Accuracy of judgment is indexed by
the direction of the judged relationship. For example, a-versus-b judges
should judge the conditional probability problem in Table 1a as a case in
which A1 is less likely given B1 than given B2 (2 < 12). Sum of diagonals
judges should judge the two events as unrelated (2 + 10 = 0 + 12) and
conditional probability judges should see A1 as more likely given B1 than
given B2 (2/2 vs. 12/22). A subject's strategy is indexed by the accuracy
pattern on a 12 problem set, including 3 problems of each problem strategy
type. Table 1b indicates judgment accuracy predicted by each of the
proposed rules. Subjects who pass no problem types are labeled Strategy
0. All other patterns not represented in the table would be labeled
unclassifiable. We've looked at rule use in this way in several experiments
involving subjects from 4th grade through college age. Problems in
these experiments are set in the context of concrete events which could
be related. Frequency information is represented in pictorial format in
a 2 x 2 table. Subjects are asked about the relative likelihood of an
outcome given the two alternative states of the other variable.
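
The three comparison rules can be written out and applied to the conditional
probability problem cited above (cells 2, 12, 0, 10). This is a sketch under
stated assumptions rather than the scoring procedure used in the experiments:
the cell layout (a = A1 with B1, b = A1 with B2, c = A2 with B1, d = A2 with
B2) is inferred from the worked example, the function names are ours, and the
cell-a rule is left out because the text does not specify how a single cell
frequency is turned into a directional judgment.

```python
from fractions import Fraction

# Assumed cell layout, inferred from the worked example in the text:
#            B1   B2
#   A1        a    b
#   A2        c    d


def direction(x, y):
    """Turn a comparison of two quantities into a judged relationship."""
    if x > y:
        return "A1 more likely given B1"
    if x < y:
        return "A1 less likely given B1"
    return "unrelated"


def a_versus_b(a, b, c, d):
    # Compare how often A1 occurs with B1 against how often it occurs with B2.
    return direction(a, b)


def sum_of_diagonals(a, b, c, d):
    # Compare confirming cases (a + d) with disconfirming cases (b + c).
    return direction(a + d, b + c)


def conditional_probability(a, b, c, d):
    # Compare P(A1 | B1) = a / (a + c) with P(A1 | B2) = b / (b + d).
    # Assumes each column holds at least one observation.
    return direction(Fraction(a, a + c), Fraction(b, b + d))


# The conditional probability problem described in the text.
problem = (2, 12, 0, 10)
judgments = {
    rule.__name__: rule(*problem)
    for rule in (a_versus_b, sum_of_diagonals, conditional_probability)
}
```

For this problem the a-versus-b rule judges A1 less likely given B1, the sum
of diagonals rule judges the events unrelated, and only the conditional
probability rule judges A1 more likely given B1, which is the pattern the text
describes.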

Our past evidence indicates a strong developmental trend in the 4th
grade to college age span. The modal strategy at 4th grade was the
a-versus-b rule, although Strategy 0 and unclassifiable judges were also
common. The sum of diagonals rule was used by a substantial group of
subjects in our 7th and 10th grade samples. The conditional probability
rule was used by a substantial minority of subjects in tenth grade and
college. The cell-a rule was rare at all ages tested. Thus, subjects
used increasingly sophisticated rules with increasing age. However, the
optimal conditional probability rule was used by a minority of subjects
even at college age.

Having discovered these developmental trends, we are now trying to
account for those trends. That is, what knowledge differences
between these age groups may be implicated in the differences in rule
use? A common approach to the problem is to develop a training method
which is effective in eliciting use of more advanced rules. Contents of
those effective interventions allow us to identify one sufficient account
of naturally occurring developmental trends. Effective training programs
may also be of pragmatic value in improving covariation judgment.

Our first concern was with the many fourth graders who didn't match
any of our proposed rules. Given the number of such subjects, we have
to consider the possibility that these children were confused by some
aspect of our method and were unable to demonstrate their true competencies.
Our approach was to elaborate our instructions to insure that the children
understood the tabled stimuli and to reformulate the covariation question
in a syntax more appropriate for younger children.

These modifications were made to make our problems more comprehensible
to younger children. It turns out that we outdid ourselves in this
respect. When we tested a new sample of children, nearly all of our subjects
were classifiable by one of our rules in the fourth grade, and a majority
of children showed systematic rule use in the second and third grades.
Overwhelmingly, these subjects were classified as using the a-versus-b
rule. Unclassifiable and Strategy 0 judgment patterns were predominant
among first and second grade children. As a result, this population was
the target age for an attempt to elicit use of a simple judgment rule.
Thus, the first experiment I'll describe is an attempt to train 7 year
old subjects to use the a-versus-b rule. We opted not to train children
in use of the cell-a rule since it so rarely occurred naturally.

Our training approach stemmed from our suspicion that the judgment
question itself focused children's attention on cells a and b of the
contingency table. Asked if plants are more likely to be healthy when
they get bug spray or when they don't get bug spray, a subject may look
at those two event conjunctions (i.e., healthy plants-bug spray; healthy
plants-no bug spray). We thought of this as a problem of attention
direction. This was the reasoning behind our Attention-only condition,
where, on a set of 6 training problems, the experimenter asked the
subject to point to the event combinations specifically mentioned in the
question and to count the number of cases in each of the two cells.
Subjects then made their covariation judgment. Subjects had mastered
this technique by the end of the training problems.

A subject may also fail to use the a-versus-b rule because he or
she misses the comparison aspect of the question, i.e., which is more
likely. A second group of subjects was given the Attention instructions
on the training problems and, in addition, was specifically asked which
of the two cells had more cases in it. Subjects then made their covariation
judgments. Subjects also mastered this technique by the end of the
training problems. This group is the Attention-plus-More training group.

A final group is a no-training control group, who judged the same 6
problems but were given no special instructions.

All subjects were pretested to establish initial rule use. Unclassifiable,
Strategy 0, and cell-a judges were included in the paradigm. Subjects
were randomly assigned to one of the three conditions.

Subject fatigue prevented an immediate posttest of training effects.
However, all subjects did return a week later for a delayed posttest.
Subjects' performance at that time is illustrated in Table 2 of your
handout. As you can see, rates of improvement were at the same low level
for Attention-only and control subjects. This failure of Attention-only
instructions may imply that subjects at this age already know how to
find the relevant cells. However, the Attention-plus-More training did
result in reliable improvement at the delayed posttest. Thus, we see
that the comparative aspect of the judgment may be a key obstacle to
natural use of this simple rule by young subjects.

Having discovered that young children could use this simple rule,
we next attempted to elicit use of more advanced rules from older subjects.
Our first approach was to train subjects to use the sum-of-diagonals
strategy. This strategy is built on the notion that some event combinations
confirm a particular relationship between events and that some combinations
disconfirm that rule. For example, if bug spray is good for plants, we
should see many cases of healthy plants with bug spray and unhealthy
plants without bug spray. Healthy plants without bug spray and unhealthy
plants with bug spray would be exceptions to the relationship.
Sum-of-diagonals training taught subjects that cells a + d were good examples
of a positive relationship and that cells b + c were exceptions to the
rule. Subjects learned that the reverse was true for negative relationships.
Subjects practiced pointing to the cells with good examples and those
with exceptions to the rule on each of 6 training problems. Subjects
also counted the number of cases in cells a + d and in cells b + c for
the training problems. These subjects then made their covariation
judgments. A group of control subjects made covariation judgments on
the same problems without the benefit of training. Training effects
were measured in an immediate posttest and in a delayed test one week
later. Subjects in the experiment were 4th, 5th, 7th and 8th grade
children whose pretest performance showed use of cell-a and a-versus-b
rules.

The results of this training experiment are shown in Table 3. Note
that unclassifiable posttest subjects were not included in the analyses.
Trained subjects were significantly more likely to show use of the
sum-of-diagonals rule both at the immediate and at the delayed posttest.
This evidence indicates that subjects can indeed show improved rule use
with a relatively simple training procedure. These training procedures
were similarly effective among the younger and older subjects in the
sample. Our training in confirming and disconfirming cases not only
yielded better accuracy, but those judgments also conformed to the
pattern predicted by the sum-of-diagonals rule. This suggests that this
reasoning may well underlie the natural acquisition of this rule in
children's development. At a minimum, these training effects identify
one sufficient model of this developmental process.



Our final efforts at training are looking at what it takes to
elicit use of the optimal conditional probability rule among junior high
aged subjects. Thus far, it looks like our training efforts are successful.
This set of training studies suggests that subjects at all ages may show
problems in covariation judgment but that those problems are not irremediable.
Our evidence suggests that relatively simple training efforts can elicit
use of more sophisticated and more accurate judgment rules.
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Table 1 

A) Sample covariation problems 



Cell a 
^'Problec 



a^ versus 
Problem 







«2 






^2 




11 






4 




^2 


I 


/8 


^2 

1 


3 


16 



^uii of Diagonal 
Proble^i 



Conditional 
Probability 
^Probl^ 



^1 




f ^ 




^2 










4. 


4 




2 


12' 




L5 




0 


LO 



B) Strategy use and resultant patterns of problem accuracy
   (+ = accurate, 0 = inaccurate)

                                      Problem Strategy Type

Subject                                              Sum of      Conditional
Strategy                   Cell a    a versus b    Diagonals    Probability

Conditional Probability       +          +             +             +
Sum of Diagonals              +          +             +             0
a versus b                    +          +             0             0
Cell a                        +          0             0             0
Strategy 0                    0          0             0             0
Table 2

Effects of a-versus-b Training on Delayed Posttest
Performance of 7 year old children

                             Attention    Attention
                  Control    Only         plus more    Total
Improved             3          3            10          16
Didn't Improve       9          9             2          20
Total               12         12            12          36

χ² = 11.02, df = 2, p < .01.
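
As an arithmetic check on the statistic reported for Table 2, the following minimal sketch computes a Pearson chi-square for a 2 x 3 table. The cell counts are an assumption: the Didn't Improve row is derived as each group's total of 12 minus its Improved count (3, 3, and 10). The function name chi_square is illustrative only and is not part of the original report. With these counts the statistic comes to about 11.0 on 2 degrees of freedom, in line with the reported value.

```python
def chi_square(table):
    """Pearson chi-square for a table given as a list of rows of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(col_totals) - 1)
    return stat, df


# Rows: Improved, Didn't Improve (assumed: 12 minus Improved).
# Columns: Control, Attention Only, Attention plus more.
table2 = [[3, 3, 10],
          [9, 9, 2]]

stat, df = chi_square(table2)
print(f"chi-square = {stat:.3f}, df = {df}")
```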



Table 3

Effects of Sum-of-Diagonals Training
on Immediate and Delayed Posttest
Performance of 4th-8th grade children

                  Immediate Posttest                   Delayed Posttest
                      Didn't                               Didn't
           Improved   Improve   Unclassifiable  Improved   Improve   Unclassifiable    N
Control        4         17           2             5         14           4          23
Training      15          6           8            21          6           2          29
Total         19         23          10            26         20           6          52

           χ² = 9.6, df = 1, p < .01            χ² = 9.87, df = 1, p < .01
and Method of Information Presentation

Edward A. Wasserman and Harriet Shaklee
The University of Iowa

Running head: Judging Response-Outcome Relations



Abstract 

A series of four experiments investigated college students' judgments of
interevent contingency. Subjects were asked to judge the effect of a discrete
response (tapping a wire) on the occurrence of a brief outcome (a radio's
buzzing). Pairings of the possible event-state combinations (response-outcome,
response-no outcome, no response-outcome, no response-no outcome) were
presented in a summary table (Experiments 2 and 4), in an unbroken time line
(Experiments 1, 2, and 4), or in a broken time line format (Experiment 3).
Subjects judged the extent to which the response caused the outcome or
prevented it from occurring. Across all methods of information presentation,
judgments were a positive function of response-outcome contingency and outcome
probability. In the unbroken time line condition, judgments of negative
response-outcome contingencies were less extreme than judgments of equivalent
positive contingencies. This asymmetry was smaller in the broken time line
condition and in those conditions where subjects were encouraged to segment an
unbroken time line into discrete response-outcome units. Finally, judgments
of positive and negative relationships were generally symmetrical in the
summary table condition. Relative to the two time line portrayals, summary
table judgments were also less influenced by the overall probability of
outcome occurrence. These judgment differences among format conditions suggest
that, depending on the method of information presentation, subjects differently
partition event sequences into discrete event pairings. The segmenting of
continuous event streams may be an important factor in the accuracy of
everyday judgments of interevent contingency.
And now remains,
That we find out the cause of this effect,
Or rather say the cause of this defect,
For this effect defective comes by cause.

W. Shakespeare; Hamlet, II, ii

Students of behavior both before and after Shakespeare have been interested
in causal perception. Most noteworthy was D. Hume (1739) who proposed a set
of conditions which were conducive to cause-effect impressions. Hume's
insights into the psychology of causation have helped to shape the direction
of subsequent research and theory in the area.

Also important have been discussions of causal perception from comparative
and developmental perspectives. C. Morgan (1893, 1894) concluded on
the basis of extremely limited evidence that human adults, but not children
and animals, can perceive the relationship between events. More systematic
data led Inhelder and Piaget (1958) to propose a stagewise unfolding of the
human's conception of interevent correlation or contingency as the individual
develops from child to adult.

Subsequent investigations into the perception of interevent relations
have not yielded evidence that is consistently favorable to the developmental
and evolutionary speculations of Morgan and of Inhelder and Piaget. Nor is
the evidence particularly supportive of modern theories, which posit a
virtual identity between humans' and animals' perceptions and the actual
interevent contingencies that prevail in their environments (e.g., Heider,
1958; Kelley, 1967; Mackintosh, 1974; Rescorla, 1978).

In the basic human judgment paradigm, subjects are given information
about the frequency of pairings of alternative states (e.g., presence and
absence) of two events (e.g., plant food and plant health); they can then be
asked to judge the direction and magnitude of the relationship between the
events. In many of these experiments, adults do not accurately judge the
correlation between two binary variables (see Crocker, 1981 for a review).
Despite these negative results, other work has been more successful in
showing that adults can accurately judge interevent relations under some
circumstances (e.g., Allan & Jenkins, 1980; Alloy & Abramson, 1979; Seggie,
1975; Seggie & Endersby, 1972; Shaklee & Tucker, 1980). Nevertheless, many
factors have been suggested over the past 20 years which may contribute to
distortions in the perception of correlation.

Investigators have found that the accuracy of correlational judgments
depends on the sign of the relationship being judged. In particular, Erlick
and Mills (1967) found that subjects judged negative correlations as closer to
zero than positive correlations of equal magnitudes. Also common is the
result that subjects find contingencies of zero to be especially difficult to
identify. For example, Seggie (1975) reported that subjects were accurate in
their judgments of contingent relationships, but were error-prone in judging
noncontingent relationships (also see Allan, 1980; Allan & Jenkins, 1980).
Alloy and Abramson (1979) replicated this pattern of differential accuracy in
nondepressed subjects, but found that depressed adults judged noncontingent
problems closer to zero than did nondepressed subjects.

One must, however, be cautious in interpreting the effects of relationship
direction; subjects may approach the stimuli in question with strong
expectations about the nature of the relationship that will hold. In Seggie's
1975 study, for example, subjects judged whether or not hospitalizing a victim
of a tropical disease would improve the chances of recovery. Erlick and
Mills' (1967) subjects judged the relationship between the quantity of a
particular food a person ate and whether the person felt better or worse.
People who believe in the merits of medical science or hearty eating would be
likely to expect each to improve general well being. This expectation could
produce a bias to report relationships as positive, resulting in errors in
judging negatively related and independent events.
Evidence of such an expectation effect was found in the research of
Chapman and Chapman (1967a, b), where subjects judged there to be a positive
relationship between semantically-associated clinical signs and symptoms in
stimuli that actually presented the sign and symptom as independent, or even
negatively, related. This illusory correlation effect proved to be highly
resistant to a variety of attempts to reduce it, including exposing subjects
to the stimuli several times and offering them a $20 reward for accuracy.
Similar expectancy effects may be a reason for some past findings of
differential accuracy as a function of relationship direction. Any attempt to
examine the effect of relationship direction should then be conducted in a
context in which prior expectations are minimal.

A second common finding in past research is that judgments of interevent
correlations are biased by the relative frequencies of the event states of the
variables involved. For example, Jenkins and Ward (1965) asked subjects how
much control their responses (pushing Button 1 or 2) had over the frequency
with which a score light appeared. Subjects' judgments of control were most
strongly correlated with the number of times the score light occurred,
regardless of whether that outcome was actually influenced by their choice of
buttons. Allan and Jenkins (1980) found that this bias was reduced, but not
eliminated, when subjects had a single button to press or not to press,
compared to Jenkins and Ward's two-button condition (also see Alloy &
Abramson, 1979). The findings of these investigations indicate that the
probability of the outcome is a second possible confound to be controlled or
manipulated in assessing contingency judgment.

A final recurrent finding in past research is that the accuracy of judging
interevent contingency depends on how the event frequency information is
presented. Two common formats present this information either as a series of
individual event-state combinations (e.g., Alloy & Abramson, 1979; Shaklee &
Mims, 1982; Ward & Jenkins, 1965) or as a summary table (e.g., Seggie, 1975;
Smedslund, 1963; Ward & Jenkins, 1965). Experiments which have compared the
two presentation formats have found accuracy to be higher when the frequency
information is summarized in table format.

Of course, the serial and summary formats differ in a variety of ways.
Most obvious is the added memory demand involved in the trial-by-trial
presentation of information; thus, subjects who add strong memory load to an
already complex judgment process may compromise accuracy to simplify an
overwhelming task. Shaklee and Mims (1982) relied upon such a memory account in
interpreting their judgment findings. Ward and Jenkins (1965), however,
argued that, while important, memory load cannot fully account for the
judgment difference between serial and summary formats. Rather, they proposed
that the serial presentation of stimulus information may lead subjects to
organize the information differently from those who view the same information
in a tabled format. In support of this point, Ward and Jenkins note that
subjects in their experiments who were shown tabled information after serial
presentation used less appropriate judgment strategies than those who saw only
the tabled information. If information is organized differently under the two
conditions, then this may lead subjects to make different judgments of
interevent relationships. Although this reasoning is plausible, past paradigms
have confounded presentation format with memory load; the contributions of
memory and organization effects in past research cannot then be separated.
The issue is best addressed by comparing use of serial and summary frequency
information in conditions alike in memory load.

The present study thus compared serial and summary formats in a setting
free of memory demands, while also using a problem for which subjects should
have little bias as to the nature of the interevent relation. The basic
situation involved troubleshooting a malfunctioning radio. While this
situation is far less dramatic than Polonius' efforts to determine the reason
for Hamlet's odd behavior, it is nonetheless representative of everyday
instances of causal reasoning.

Subjects were told that an individual was trying to find the cause of an
intermittent buzz (B) by occasionally tapping (T) on a wire inside the radio.
The results of the troubleshooting were then given to the subject, who was
asked to judge the degree to which tapping affected the radio's buzzing: from
"causes the sound to occur" to "has no effect on the sound" to "prevents the
sound from occurring." This context has the virtue of being one in which
subjects should not have a strong expectation about the nature of the
response-outcome relationship; tapping a wire should be as likely to complete
as to break a loose connection. Similarly, if the wire is not loose, tapping it
should have no effect on the buzz.

Holding constant the probability of tapping, p(T), both the probability
of a buzz given a tap, p(B/T), and the probability of a buzz given no tap,
p(B/T̄), were systematically varied to yield 24 different troubleshooting
conditions. These conditions in turn constituted nine tap-buzz contingencies,
p(B/T) - p(B/T̄), ranging in .25-steps from -1.00 to +1.00 (see Allan, 1980 for
further discussion of various measures of contingency or correlation).

An additional feature of the 24 troubleshooting conditions was that they
were contrived in such a way that they varied not only in the tap-buzz
contingency, but also in the overall probability per sampling interval of the
buzzing sound, p(B). Eight different buzz probabilities were studied, ranging
in .125-steps from .125 to 1.000. Because the tap-buzz contingency and the
relative frequency of the radio's buzzing vs its not buzzing were independent
dimensions in the present experimental design, the contributions of these
variables to subjects' judgments of correlation could be individually assessed.
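
The design just described can be sketched in a few lines. The sketch assumes that each conditional probability took the values 0, .25, .50, .75, and 1.00 (an inference from the .25-step contingencies, not something stated outright) and that the one combination in which no buzz ever occurs was left out, which brings the count to 24. The variable names are illustrative. Because p(T) is fixed at .50, p(B) is simply the average of the two conditional probabilities.

```python
from fractions import Fraction
from itertools import product

P_TAP = Fraction(1, 2)  # p(T), held constant at .50

# Assumed levels for both p(B/T) and p(B/no tap): 0, .25, .50, .75, 1.00
levels = [Fraction(k, 4) for k in range(5)]

conditions = []
for p_b_tap, p_b_no_tap in product(levels, repeat=2):
    if p_b_tap == 0 and p_b_no_tap == 0:
        continue  # the buzz never occurs, so this cell is not a usable problem
    contingency = p_b_tap - p_b_no_tap
    p_buzz = P_TAP * p_b_tap + (1 - P_TAP) * p_b_no_tap
    conditions.append((p_b_tap, p_b_no_tap, contingency, p_buzz))

contingencies = sorted({c[2] for c in conditions})
buzz_probs = sorted({c[3] for c in conditions})

print(len(conditions), len(contingencies), len(buzz_probs))  # 24 9 8
```

Under these assumptions the grid reproduces the counts given in the text: 24 conditions, nine contingencies from -1.00 to +1.00, and eight buzz probabilities from .125 to 1.000.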

The method of information presentation was studied with two basic
techniques. In one, subjects were given summary tables showing the numbers of
times that the four possible event sequences occurred in 24 sampling intervals:
tap-buzz, tap-no buzz, no tap-buzz, and no tap-no buzz. In the other, the
same information was given in a time line format, with the 24 sampling
intervals graphically and linearly arrayed. Such an arrangement preserves the
sequential character of the critical events, while minimizing the strong
memory demands that are ordinarily placed on subjects when they are given
information in a trial-by-trial fashion. This method was originally suggested
by Ward and Jenkins (1965, p. 240); however, it has never been utilized in
experimental research.

Since past work has not entailed a time line presentation of event
frequencies, our series of investigations began by looking at subjects'
judgments using this format alone. Experiment 1 explored the effects of
tap-buzz contingency and buzz probability on judgments of tap-buzz correlation
in both within-subjects and between-subjects paradigms. Experiment 2 directly
compared the effects of the time line and summary table methods of information
presentation. Because the second experiment disclosed that judgments did
differ under the two conditions of information presentation, Experiments 3 and
4 explored possible reasons for the judgment differences.

Experiment 1 

The first experiment investigated the judgment of response-outcome
correlation when responses and outcomes were shown to subjects in a time line
format. In one part of the experiment, each subject received only 1 of 24
possible tap-buzz conditions; in the other part, each subject received all 24
tap-buzz conditions. Both between- and within-subjects conditions were included
in order to identify possible influences of multiple judgments, since we hoped
to use the more efficient within-subjects procedure in later work. Subjects'
ratings of the response-outcome relationships allowed us to determine the
degree to which the tap-buzz contingency, p(B/T) - p(B/T̄), and the overall
probability of the buzzing sound, p(B), influenced their behavior. To
determine whether the sign of the response-outcome correlation affected
subjects' judgments, equal numbers of positive and negative contingencies were
studied.
Method 

Subjects. The subjects were participants in an introductory psychology
class, who served in the experiment as one option for fulfilling a course
requirement. A total of 552 students served in the between-subjects part of
the experiment and a total of 25 students served in the within-subjects part.

Problems. A set of 24 problems was constructed. These problems were
alike in that they all comprised 24 sampling intervals. Each sampling
interval in turn had two components: a "response" component during which a tap
might or might not occur, and an "outcome" component during which a buzz might
or might not occur. Each of the 48 resulting components of a problem was
denoted on the subject's problem sheet as a dash; the 48 consecutive dashes
thus constituted the time line for each problem. Taps in the response
component of a sampling interval were denoted by an "A" above the dashed time
line, and buzzes in the outcome component of a sampling interval were denoted
by a "B" below the dashed time line.

For all 24 problems, there were 12 taps represented in the possible
response components. Thus, the probability of tapping per sampling interval,
p(T), was always .50. Problems varied in terms of the likelihood that a buzz
was represented in the outcome components, p(B), and the likelihood of buzzes
following taps, p(B/T), and no taps, p(B/T̄), in the response components.




For each of the 24 problems, Table 1 shows the numbers of sampling
intervals of each of four possible types: tap-buzz, tap-no buzz, no tap-buzz,
and no tap-no buzz. Note that the number of sampling intervals with a tap is
equal to 12, which is the same as the number of sampling intervals without a
tap. Note also that the total number of sampling intervals equals 24. And
note finally that the number of sampling intervals with a buzz varies from 3
to 24.



Insert Table 1 about here



For each problem, time lines were constructed from smaller groupings that
contained eight sampling intervals. The sequence of event pairings was
determined randomly within each eight-sample group. While eight-sampling groups
theoretically provide all the necessary information that is needed to
distinguish the 24 problems, we thought it advantageous to triple the amount of
input given to the subjects in hopes that their judgments might thereby be
improved. For example, problem 18 in Table 1 was represented as follows:

[Time line for problem 18: A's above and B's below a line of 48 dashes;
the original spacing is not recoverable in this copy.]
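
The construction described above (three randomly ordered groups of eight sampling intervals, each interval drawn as a response dash followed by an outcome dash) can be illustrated with a minimal sketch. The function name build_time_line, the fixed seed, and the example counts are hypothetical; the counts do not reproduce problem 18, whose cell frequencies are not given here. The exact spacing on the original problem sheets is also an assumption.

```python
import random


def build_time_line(group_counts, n_groups=3, seed=0):
    """Return (top, dashes, bottom) strings for one problem.

    group_counts gives, per group of eight sampling intervals, the numbers of
    (tap-buzz, tap-no buzz, no tap-buzz, no tap-no buzz) intervals.
    """
    if sum(group_counts) != 8:
        raise ValueError("each group must contain eight sampling intervals")
    rng = random.Random(seed)
    base = ([(True, True)] * group_counts[0]
            + [(True, False)] * group_counts[1]
            + [(False, True)] * group_counts[2]
            + [(False, False)] * group_counts[3])
    intervals = []
    for _ in range(n_groups):
        group = base[:]
        rng.shuffle(group)  # random order within each eight-interval group
        intervals.extend(group)

    top, dashes, bottom = [], [], []
    for tap, buzz in intervals:
        top += ["A" if tap else " ", " "]      # A sits over the response dash
        dashes += ["-", "-"]                   # response dash, outcome dash
        bottom += [" ", "B" if buzz else " "]  # B sits under the outcome dash
    return "".join(top), "".join(dashes), "".join(bottom)


# Hypothetical problem: p(B/T) = .75 and p(B/no tap) = .25 in every group.
for line in build_time_line((3, 1, 1, 3)):
    print(line)
```

Because every group of eight carries the same four frequencies, tripling the groups changes only the amount of input, not the contingency or the buzz probability.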

Figure 1 shows a second method of depicting the 24 problems that were
studied. Both the top and bottom portions of the figure locate each problem
within the unit square defined by the two independent conditional probabilities,
p(B/T) and p(B/T̄). The top portion of the figure shows the response-outcome
contingency, p(B/T) - p(B/T̄), of each of the problems; the bottom portion
shows the likelihood of the buzzing sound per sampling interval, p(B), for the
same problem set. There are nine response-outcome contingencies and eight
probabilities of buzz presentation represented by the 24 problems in Figure 1.
Furthermore, these two procedural dimensions are orthogonal, as can be seen by
the opposite slopes of the lines that connect the 24 problems in the top and
bottom portions of the figure. From the figure it can finally be seen that
one possible problem was not included in the set. When p(B/T) = 0 = p(B/T̄),
p(B) = 0; little sense could thus have been made of the task by the subjects
(see next section for questionnaire instructions).



Insert Figure 1 about here

Procedure. Subjects were given problem sheets that each contained
instructions, a time line, and a rating scale. The instructions read as
follows:

     After buying a new radio, Kim finds that it emits a brief
     buzzing sound every so often. Kim finds this buzzing sound
     annoying and decides to find its cause. Removing the back
     of the radio, Kim suspects that a wire may be loose. Kim
     chooses a wire and taps on it a number of times in order to
     see if this has any effect on the buzzing sound. In the
     diagram below, Kim's tapping on the wire is shown by an
     A above the time line which moves from left-to-right across
     the page. An occurrence of the brief buzzing sound is
     shown by a B below the time line.

One of the 24 different time lines then followed. Below the time line was a
nine-point rating scale ranging from -4 (prevents sound from occurring) to 0
(has no effect) to +4 (causes sound to occur). Subjects were asked to circle
the number that best corresponded to their answer to the question, "If you
were Kim, what would you conclude was the effect of tapping on the wire?"

In the between-subjects part of the experiment, only 1 of the 24 problem
sheets was given to each subject. In the within-subjects part of the
experiment, each subject received all 24 problem sheets, with the order of the
sheets randomly determined for each subject. The 24 problem sheets were
clipped together; each packet also included the following cover sheet:

     The aim of this experiment is to see how people judge
     the relationship between their actions and the consequences
     of those actions. In the 24 sheets that follow, the same
     basic problem is posed: what is the relation between Kim's
     tapping on the wire of a malfunctioning radio and the
     occurrence of a brief buzzing sound that the radio occasionally
     emits. The 24 sheets differ only in the particular relationship
     between Kim's tapping and the occurrence of the sound. For
     each of the 24 sheets, please rate the degree to which Kim's
     tapping affects the rate of the radio's buzzing, from "prevents
     the sound from occurring" to "causes the sound to occur." As
     you go through the 24 problems, you'll soon see that the problems
     differ from one another to varying degrees. You may sometimes
     want to look back to prior problems; you may even want to change
     prior responses. This is OK. It is more important to work
     through the problems carefully and methodically than to give
     quick and offhand reactions. Indeed, the materials are
     paper-clipped together so that you can sort through the many sheets
     and organize them any way you wish.

Results

Table 2 shows the means and standard deviations of subjects' judgments
for the 24 problems in both the between- and within-subjects parts of the
experiment. Each of the 24 problems is located in the table by the
coordinates p(B/T) - p(B/T̄) and p(B). In general, subjects' rating scores were
positive functions of both p(B/T) - p(B/T̄) and p(B).

Insert Table 2 about here



Figure 2 graphically portrays subjects' rating scores as separate
functions of p(B/T) - p(B/T̄) and p(B) in each part of the experiment. Analysis
of variance simultaneously assessed the reliability of these two sets of
functions.



Insert Figure 2 about here



The left panel of Figure 2 displays subjects' ratings as a function of
p(B/T) - p(B/T̄). The positive diagonal in the figure shows the responses of a
hypothetical judge whose responses correspond in a linear fashion to the
actual response-outcome contingencies and who also employs the full rating
scale. In the between- and within-subjects parts of the experiment, subjects'
judgments were reliable linear functions of p(B/T) - p(B/T̄), F(1, 528) =
139.17, p < .001, and F(1, 24) = 74.76, p < .001, respectively; however, the
slopes of those functions were clearly less than that of our hypothetical
linear observer. The between- and within-subjects functions also had reliable
quadratic components, F(1, 528) = 11.28, p = .001, and F(1, 24) = 28.07,
p < .001, respectively; this trend appears to be due to the negative segments
of the functions having shallower slopes than the positive segments. Finally,
in the within-subjects part of the experiment, the contingency-rating function
had a reliable cubic component, F(1, 24) = 10.96, p = .003; this trend appears
to be due to the function having an inverted shape. Although the overall
form of the between-subjects function was similar, it did not have a reliable
cubic component.

The right panel of Figure 2 displays subjects' ratings as a function of
p(B). In the within-subjects part of the experiment, ratings were a positive
linear function of p(B), F(1, 24) = 32.63, p < .001. In the between-subjects
part of the experiment, the linear trend only approached significance,
F(1, 528) = 2.90, p = .089.

To assess the relative contributions of p(B/T) - p(B/T̄) and p(B) to
subjects' judgment scores, the percentage of problem variance accounted for by
these factors was determined through the cubic component of each; beyond the
cubic component, no significant variance remained for either part of the
experiment. In the between-subjects part of the experiment, p(B/T) - p(B/T̄)
accounted for 86.47% of the total variance and p(B) accounted for 3.21%; in
the within-subjects part of the experiment, the corresponding scores were
71.87% and 24.10%.
Discussion

Subjects' judgments of contingencies in the time line format showed
several interesting trends that were generally comparable in the within- and
between-subjects parts of the experiment. These results also accord well with
past paradigms using different presentation formats. First, judgments of
response-outcome correlation were a reliable function of the contingency
between the tapping of a wire and the occurrence of a brief buzzing sound.
Subjects' ratings rose as the tap-buzz contingency, p(B/T) - p(B/T̄), increased
from negative to positive values. Thus, subjects clearly showed some
sophistication about appropriate bases of contingency judgment.

The relative accuracy of subjects' judgments is, however, another issue.
Mean judgments indicated that subjects rated noncontingent relationships close
to zero, but ratings of several negative relationships hovered close to zero
as well. While subjects were asked to rate both the degree and the sign of a
correlation, the clearest evidence of accuracy here was the rated direction of
the relationship. Subjects' judgments should also have been ordered according
to the strength of the correlation. While this was generally true, the ratings
yielded contingency judgments that were poorer than ideal. Indeed, the
quadratic component of the judgment function indicates that subjects did not
treat positive and negative relationships symmetrically; contingencies of the
same absolute value were rated as stronger for positive than for negative
relationships. The form of this difference in ratings of relationship strength
closely resembles that found in prior research by Erlick and Mills (1967).

The second main finding was that judgments of correlation were reliably
influenced by the likelihood of the buzzing sound, p(B). This bias is
comparable to that found in other studies in which the judgment of contingency
depended on the likelihood that the outcome occurred (Allan & Jenkins, 1980;
Alloy & Abramson, 1979; Jenkins & Ward, 1965). These prior studies most
convincingly demonstrated a bias effect of p(B) with response-outcome
contingencies of zero; Allan and Jenkins' (1980) investigation further suggested
that the bias effect could arise under positive contingencies. The present
report confirms the above trends and also shows that the effect of p(B) on
judgments holds under negative response-outcome contingencies as well (see
that ratings tend to increase from top to bottom within most columns of Table
2).

Experiment 2 

The results of the time line portrayals in Experiment 1 were comparable
in many ways to those of past paradigms. However, subjects who view
information in a particular format may treat the information in a manner
specific to that format; that is, subjects' attention to information may depend
on the way the information is presented. The organization or integration of
attended information may vary with stimulus format as well. We propose three
ways in which the time line and the more familiar summary table format may
produce different judgments.

First, tabled presentation of event frequency information offers the
subjects tallies of the frequencies of each type of event-state combination.
Our time line presentation (like past serial presentation techniques) requires
the subjects to generate such tallies on their own. Subjects given time line
information may guess rather than count those frequencies, resulting in
estimation errors. This logic suggests that judgments with time line
presentation will be generally less accurate than judgments with tabled
presentation and that such differential accuracy will be relatively constant
across positive, negative, and noncontingent relationships. The resultant
judgment function should be relatively flat across all contingencies compared
to that of tabled information.

A second possible source of difference is the fact that the summary table
presents the event-state combinations in a form of comparable salience. In
contrast, each type of event pairing has a unique representation in the time
line format (i.e., AB, -B, --). As a result, some types of event pairings
may be more salient than others. In particular, the interval pairs with two
event absences (--) may be less prominent than those with one or both events
present. This feature may also have been true of past serial presentation
paradigms. If so, subjects should underestimate the frequency of no tap-no
buzz pairings. Since the denominator of the conditional probability, p(B/T̄),
would then be smaller than would be accurate, this would result in an estimate
of p(B/T̄) that is too high. This in turn should result in a bias to judge
contingencies as being more negative in the time line format than the same
contingencies presented in the tabled format.
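
The direction of this predicted bias can be checked with a short worked example. The cell counts below are hypothetical (they are not taken from Table 1), as is the assumption that a subject overlooks three of the nine no tap-no buzz intervals; the function name delta_p is illustrative.

```python
def delta_p(a, b, c, d):
    """Contingency p(B/T) - p(B/no tap) from the four cell frequencies.

    a = tap-buzz, b = tap-no buzz, c = no tap-buzz, d = no tap-no buzz.
    """
    return a / (a + b) - c / (c + d)


actual = (9, 3, 3, 9)     # hypothetical problem with a contingency of +.50
perceived = (9, 3, 3, 6)  # three no tap-no buzz intervals go unnoticed

print(delta_p(*actual))     # 0.5
print(delta_p(*perceived))  # smaller, because p(B/no tap) rises from 3/12 to 3/9
```

Undercounting the no tap-no buzz cell leaves p(B/T) untouched and raises the estimate of p(B/T̄), so the perceived contingency shifts in the negative direction, as the argument above requires.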

Finally, the time line format allows the subject to determine the delay
between tap and buzz that will be counted as a tap-buzz pairing. Consider the
interval series A--B. The tabled format would represent this as one occurrence
of tap-no buzz and one of no tap-buzz. However, a subject given the time line
presentation may well consider this series to be a single pairing of tap-buzz.
This tendency would lead to an underestimation of the frequencies of event
pairings tap-no buzz and no tap-buzz and an overestimation of the frequency of
tap-buzz pairings. These errors would yield an inflated numerator for p(B/T)
and a smaller than accurate numerator for p(B/T̄). These biases should
result in judgments of contingencies being more positive in the time line than
in the summary table format. This problem of event segmenting should not have
been true of past discrete trial presentations, where each slide or card
defined an event-outcome pairing. However, the problem may be true of event
processing in real time, when event continua must be defined as discrete
events.
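
A similar worked example illustrates the segmenting account. It assumes, purely for illustration, that a subject treats k of the A--B series as single tap-buzz pairings, so that k tap-no buzz intervals and k no tap-buzz intervals are each recounted as k tap-buzz intervals. The counts and function names are hypothetical.

```python
def conditional_probs(a, b, c, d):
    """Return (p(B/T), p(B/no tap)) from the four cell frequencies.

    a = tap-buzz, b = tap-no buzz, c = no tap-buzz, d = no tap-no buzz.
    """
    return a / (a + b), c / (c + d)


def merge_delayed_pairs(a, b, c, d, k):
    """Recount k 'tap-no buzz, then no tap-buzz' series as k tap-buzz pairings."""
    if k > min(b, c):
        raise ValueError("cannot merge more series than are available")
    return a + k, b - k, c - k, d


actual = (6, 6, 6, 6)                          # hypothetical noncontingent problem
perceived = merge_delayed_pairs(*actual, k=2)  # (8, 4, 4, 6)

p_tap, p_no_tap = conditional_probs(*actual)
q_tap, q_no_tap = conditional_probs(*perceived)
print(p_tap - p_no_tap)  # 0.0
print(q_tap - q_no_tap)  # positive: p(B/T) rises and p(B/no tap) falls
```

In this example a noncontingent problem comes to look positively contingent, which is the direction of bias the segmenting account predicts for the time line format.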

Thus, each of three reasons for judgment differences in the two
information presentation conditions would result in a unique pattern of
judgment outcomes. Whether any of these differences will materialize is an
empirical question. Experiment 2 addressed this issue by comparing judgments
under the time line format employed in Experiment 1 with judgments of the same
problems presented in the summary table format used in past investigations
(e.g., Smedslund, 1963; Ward & Jenkins, 1965). Since judgments were so
comparable in the between- and within-subjects parts of Experiment 1, subjects
in Experiment 2 judged all 24 problems.



Method

Subjects. The subjects were 34 undergraduate research participants.

Problems. The same 24 problems were used here as in Experiment 1.
Problems in the time line format were typed on a single sheet of paper with
the nine-point rating scale to the right of each problem. Problems in the
summary table format were typed on another sheet of paper similar to Table 1,
except that the four types of sampling intervals were vertically arrayed;
identical rating scales were located beneath each problem. Problems were
presented in a single random sequence for the time line format and in a
different random sequence for the table format.

Procedure. During the first portion of the experimental session, subjects
were given an instruction sheet describing the troubleshooting problems
on the attached sheet of paper. For half of the subjects the problems were in
the time line format, and for the other half the problems were in the summary
table format. During the second half of the session, subjects worked problems
in the format not worked in the first half. Instructions for time line
problems were the same as those used in Experiment 1. Instructions for summary
table problems were the same, with appropriate adjustments to introduce the
table rather than the time line format.





Results 

Table 3 shows the means and standard deviations of subjects' judgments
for the 24 problems given in the time line and summary table formats. Because
analysis of variance failed to disclose any reliable effects attributable to
the order of format presentation, this factor is not considered in Table 3 nor
in later data analysis. As in Experiment 1, subjects' ratings were positive
functions of both p(B/T) - p(B/no T) and p(B).



Insert Table 3 about here 



Figure 3 graphically depicts subjects' rating scores as separate functions
of p(B/T) - p(B/no T) and p(B) for each method of information presentation.
Analysis of variance simultaneously compared these two sets of functions.



Insert Figure 3 about here 



The left panel of Figure 3 portrays subjects' ratings as a function of
p(B/T) - p(B/no T). Overall, ratings were reliable linear, F(1, 32) = 51.72,
p < .001, and quadratic, F(1, 32) = 12.90, p = .001, functions of tap-buzz
contingency. Additionally, there was a reliable quadratic contingency by
format interaction, F(1, 32) = 4.97, p = .033. To pinpoint the source of this
interaction, separate analyses of variance were conducted on the time line and
summary table data. For both the time line and the summary table formats,
ratings were reliable linear functions of contingency, F(1, 33) = 36.77,
p < .001, and F(1, 33) = 44.27, p < .001, respectively. However, the
quadratic trend was reliable for the time line format only, F(1, 33) = 14.59,
p = .001. Thus, subjects' judgments were reliable linear functions of
response-outcome contingency with both methods of information presentation;
however, the method of information presentation influenced those functions,
with the tabled format supporting judgments that better approximated those of
an ideal observer, particularly in the region of negative contingencies.

The right panel of Figure 3 illustrates subjects' ratings as a function
of p(B). Overall, ratings were reliable linear, F(1, 32) = 30.11, p < .001,
and quadratic, F(1, 32) = 26.68, p < .001, functions of outcome probability.
Additionally, there were reliable linear, F(1, 32) = 6.32, p = .017, and
quadratic, F(1, 32) = 12.99, p < .001, outcome probability by format
interactions. Because of these interactions, follow-up analyses were separately
performed on the time line and summary table data. For the time line data,
ratings were reliable linear, F(1, 33) = 34.57, p < .001, and quadratic,
F(1, 33) = 30.43, p < .001, functions of p(B); for the summary table data, the
linear trend was reliable, F(1, 33) = 5.33, p = .027, and the quadratic trend
fell just short of statistical significance, F(1, 33) = 3.69, p = .063. Thus,
the method of information presentation altered the influence of outcome
probability on subjects' ratings; providing the information in a time line
format both steepened the probability-judgment function and increased its
curvature relative to providing the same information in a summary table format.

And, regardless of tap-buzz contingency and buzz probability, judgments
were reliably higher in the time line condition than in the summary table
condition, F(1, 32) = 5.03, p = .032.

To assess the relative contributions of response-outcome contingency and
outcome probability to subjects' ratings, the percentage of problem variance
accounted for by each factor was determined as in Experiment 1. For the
summary table data, p(B/T) - p(B/no T) accounted for 81.35% of the total
variance and p(B) accounted for 12.58%; for the time line data, the
corresponding scores were 39.43% and 51.79%. Beyond the cubic component, no
significant variance remained for the summary table data. For the time line
data, the 8.78% remaining variance was small, but statistically significant,
F(17, 561) = 3.23, p < .001.
Discussion 

The data from subjects given the time line in this experiment replicate
the judgment patterns of subjects in the comparable condition of Experiment 1.
In addition, the results of Experiment 2 confirm prior findings (Shaklee &
Mims, 1982; Smedslund, 1963; Ward & Jenkins, 1965) that the method of
information presentation affects subjects' judgments of response-outcome
correlation.

The obtained judgment differences under two conditions comparable in
memory demands suggest that past effects of presentation conditions may not be
solely attributed to memory. In general, subjects' judgments were more
closely attuned to response-outcome contingency when information was given in
the summary table than when the same information was given in the time line.
First, the contingency-judgment function (left panel of Figure 3) was more
symmetrical about zero in the summary table condition, suggesting that
subjects rated positive and negative relationships in a comparable fashion.
Again, the time line portrayal supported less accurate judgments of negative
than positive contingencies. Second, table format judgments were less
distorted by the probability of the buzzing sound (right panel of Figure 3).
The linear contingency by format interaction showed that the time line
judgments were steeper functions of p(B) than the summary table judgments.

We previously reviewed three reasons why time line and summary table
formats may result in different contingency judgments. The suggestion that
the time line will lead to more errors in estimating frequencies of event
pairings than the summary table predicted overall poorer contingency judgment
accuracy (i.e., a flatter, but symmetrical contingency-judgment function) in
the time line than in the tabled format condition. The possibility that joint
event absences (no tap-no buzz) were less salient in the time line than in the
tabled presentation mode predicted a general bias to report relationships as
more negative in the time line than in the summary table format. However,
neither of these difference patterns describes our results.

Subjects in this experiment did show a tendency to judge relationships as
more positive in the time line than in the summary table condition. This
result supports our third proposed source of differences, that subjects may
group event pairings differently in the time line than the tabled format. In
particular, event series A--B could be identified as a single tap-buzz
occurrence rather than a tap-no buzz and a no tap-buzz, yielding a bias to
report relationships as positive. However, we should note that while ratings
were generally higher in the time line than in the summary table condition, the
positivity bias was more pronounced for negative than positive contingencies.
One possible account for this finding involves the influence of context on the
grouping of event pairings; that is, A--B may be most likely to be judged a
tap-buzz occurrence when there are few contiguous AB pairings in the time
line, as would be the case in negative contingencies.

Besides helping us to understand why different presentation formats
support different judgments, these performance differences between groups also
allow us to reject the possibility that time line subjects' problems with
rating negative contingencies are due to a response bias or to prior
expectations. Any expectation about the effect of tapping on the radio's
buzzing should be the same in the two groups, but judgments of negative
contingencies were distorted for time line subjects only. Similarly, since
subjects made judgments on the same rating scale in the two conditions,
performance differences cannot be attributed to peculiarities in the scale
itself.

Experiment 3 

The results thus far suggest that subjects may define events differently
in the time line and table formats. If this is the principal reason for the
inaccurate responses of time line subjects, then their judgments should
improve when the continuous stream of events in the time line is separated
into discrete units.

Our third experiment further explored the problem of defining individual
sampling periods by placing a clear break between paired intervals in the time
line format. To do this, we simply added a blank space between successive
sampling intervals along the time line. As in the within-subjects part of
Experiment 1, subjects rated all 24 tap-buzz contingencies. These judgments
were compared to those obtained in Experiment 1, in which successive sampling
intervals immediately followed one another.
Method 

Subjects. Another group of 25 undergraduate research participants joined
the 25 who had served in the within-subjects part of Experiment 1, and whose
data are depicted again in the Results section that follows. Subjects in
these two groups were from the same introductory psychology course and were
tested within 3 weeks of the same school term.

Problems. The problems for the new subjects were identical to those in
Experiment 1, except that one blank space was inserted between successive
sampling intervals along the time line. This format is illustrated in a
sample item (Problem 11):

A A A^A A -AA^^ A A_A 

B B B B "'^ B B B'"3 *B B ~ B *B 6 "B 

Procedure. The procedure for the new subjects given the broken time
lines was identical to that for the former subjects given the unbroken time
lines in Experiment 1.
Results 

Table 4 shows the means and standard deviations of subjects' judgments
for the 24 problems given in the broken and the unbroken time line conditions
of Experiment 3. Again, subjects' ratings were positive functions of
p(B/T) - p(B/no T) and p(B).



Insert Table 4 about here 



Figure 4 graphically illustrates subjects' rating scores as separate
functions of p(B/T) - p(B/no T) and p(B) for each time line condition. Analysis
of variance simultaneously compared these two sets of functions.



Insert Figure 4 about here 



The left panel of Figure 4 shows subjects' ratings as a function of
p(B/T) - p(B/no T). Overall, ratings were reliable linear, F(1, 576) = 542.75,
p < .001, quadratic, F(1, 576) = 34.32, p < .001, and cubic, F(1, 576) =
20.35, p < .001, functions of tap-buzz contingency. Additionally, there was a
reliable linear contingency by time line interaction, F(1, 576) = 5.08, p =
.025, and a near significant quadratic contingency by time line interaction,
F(1, 576) = 3.18, p = .075. Therefore, separate analyses of variance were
conducted on the data for the group given the broken time line and for the
group given the unbroken time line. For both the broken and unbroken time
line groups, ratings were reliable linear functions, F(1, 24) = 83.74, p < .001,
and F(1, 24) = 7^.76, p < .001, respectively; quadratic functions, F(1, 24) =
7.17, p = .013, and F(1, 24) = 29.07, p < .001, respectively; and cubic
functions of contingency, F(1, 24) = 24.83, p < .001, and F(1, 24) = 10.96, p =
.003, respectively. Thus, although the contingency-rating functions were
similar, judgments of contingency were more strongly differentiated for
subjects in the broken time line group; this greater differentiation was
generally more notable for negative than for positive contingencies.

The right panel of Figure 4 portrays subjects' ratings as a function of
p(B). Overall, ratings were reliable linear, F(1, 576) = 139.87, p < .001,
and quadratic, F(1, 576) = 25.33, p < .001, functions of outcome probability.
Additionally, there was a reliable quadratic outcome probability by time line
interaction, F(1, 576) = 6.18, p = .013. Separate analyses of variance were
therefore conducted on the data from the two time line groups. For both the
group given the broken time line and the group given the unbroken time line,
ratings were reliable linear functions of p(B), F(1, 24) = 20.62, p < .001,
and F(1, 24) = 32.63, p < .001, respectively. However, the quadratic trend
was reliable for the broken time line group only, F(1, 24) = 24.01, p < .001.
Thus, the probability-rating functions of the two time line groups were
similarly sloped, although the function for the broken time line appeared to
turn downward at high outcome probabilities more than the function for the
unbroken time line.

To assess the relative contributions of response-outcome contingency and
outcome probability to subjects' judgments, the percentage of variance accounted
for by each factor was determined as in Experiments 1 and 2. For the broken
time line group, p(B/T) - p(B/no T) accounted for 77.31% of the total problem
variance and p(B) accounted for 19.08%; for the unbroken time line group, the
corresponding scores were 71.37% and 24.10%.
Discussion

We introduced the broken time line format in Experiment 3 to partition
the time line continuum into discrete sampling intervals. The results of the
experiment indicate that this manipulation had an effect on judgments of the
problem set. Subjects judging broken time lines showed greater
differentiation in their ratings as a function of the scheduled contingency
than subjects judging unbroken time lines. This increased differentiation was
generally more prominent for negative than for positive relationships, a
difference which was also true of subjects judging tabled information in
Experiment 2.

Thus, the results of subjects who viewed the broken time lines duplicate
in some respects the behavior of subjects judging on the basis of tabled
information. Our ability to increase the accuracy of contingency judgments
by this manipulation enhances confidence in our interpretation that subjects
made errors in identifying discrete event pairings in the continuous time
lines. The similarity of judgments of tabled and broken time line information
suggests that one function of the table may be to separate a stream of events
into coherent units. Such units may be more readily classed according to the
type of event pairing and thus may be more accurately incorporated into a
contingency judgment.

While breaking the flow of the time line into discrete sampling intervals
yielded judgments more similar to those made with summary table presentation,
inspection of Figures 3 and 4 shows that the judgments obtained under these
two conditions were not identical. Contingency-judgment functions under the
broken time line format were less symmetrical about zero than under the
summary table format, and probability-rating functions were steeper in the
former condition than in the latter. Thus, other factors may well contribute
to the differences in contingency judgments obtained with the time line and
summary table formats in Experiment 2.

Experiment 4

Thus far, our leading interpretation of the problems created by a
continuous representation of events is that people have difficulty breaking the
stream into discrete units. An alternative approach to testing this account
might be to teach people to parse the time line into the component units. If
such training produces judgment functions like those found in our broken time
line and table formats, such findings would further support this as the source
of judgment differences. A second function of the table mentioned earlier
might be to offer subjects numerical summaries of the information about the
four event combinations. This summarized information may be more readily
incorporated into a decision rule in judging event covariations. In this way,
judgment accuracy might be further enhanced if subjects were asked to count
the occurrences of each event-state combination and note these frequencies
in a table. By this process, subjects would effectively convert a time line
into a table format.

Our fourth and final experiment used each of these approaches. One group
of subjects was presented with the 24 problems in our original time line
format, but was taught to break the line into response-outcome intervals
(line-interval). A second group received these instructions and was also
asked to count the frequencies of each event-state pairing and write those
frequencies in a table (line-table). Time line and table groups using our
original instructions served as comparison conditions for these manipulations.
Improved judgment by line-interval subjects compared to time line subjects
would further implicate line segmenting as a factor in contingency judgment.
Further improvement by line-table subjects would suggest that summary
information is also an important function of the tabled format. Because we found
sex differences in contingency judgment in related work of ours (Shaklee &
Hall, in press), sex was included as a factor in this experiment.
Method

Subjects. A total of 160 introductory psychology subjects served in the
experiment with 20 males and 20 females in each of four judgment conditions.

Problems. The 24 contingencies for this experiment were the same as
those in the previous experiments. Format of problems in the time line and
table representations was the same as that used in Experiments 1 and 2.

Procedure. The introduction to the troubleshooting problems was
identical to that used in the previous studies, except that the problem
representation was explained in one of four ways:

Line: These instructions were the same as those used in Experiments 1
and 2.

Line-Interval: These problems were represented in a time line like that
used in Experiments 1 and 2, but in this case subjects were specifically
instructed how to break the time line into response-outcome intervals.
Instructions were as follows:

Each dash on the time line represents one unit of time.
Time units come in pairs, with the first an opportunity
for a response (Tap or No Tap) and the second an opportunity
for an outcome (Buzz or No Buzz). Thus, pairs of successive
intervals can be of four types: Tap-Buzz, Tap-No Buzz, No
Tap-Buzz, No Tap-No Buzz. For each of the time lines, please
rate the degree to which Kim's tapping affects the rate of
the radio's buzzing, from "prevents the sound from occurring"
to "causes the sound to occur."

Line-Table: Problems and instructions were identical to those in the
Line-Interval condition, except that each problem was accompanied by a blank
table labeled as in the previous table condition of Experiment 2. Subjects
were instructed to complete the table before making their judgment.
Instructions were as follows:

Each dash on the time line represents one unit of time.
Time units come in pairs, with the first an opportunity for
a response (Tap or No Tap) and the second an opportunity for
an outcome (Buzz or No Buzz). Thus, pairs of successive
intervals can be of four types: Tap-Buzz, Tap-No Buzz, No
Tap-Buzz, No Tap-No Buzz. For each time line, please count
the frequency of each of these four types of interval pairs.
Enter those frequencies in the table to the right of the time
line. Once you have completed the table, please rate the
degree to which Kim's tapping affects the rate of the radio's
buzzing, from "prevents the sound from occurring" to "causes
the sound to occur."

Table: Problems and instructions in this condition were identical to
those in Experiment 2.

In each condition, the information offered in the instructions was shown
on a sample problem illustrating each type of response-outcome pairing.
Subjects were invited to ask any questions they might have, after which they
proceeded at their own pace through the problem set.
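
The counting step asked of Line-Table subjects can be sketched as follows. This is a minimal illustration under an assumed encoding that is not the printed format of the problems: the time line is a single string with one character per time unit, "A" marking a tap, "B" marking a buzz, and "-" marking an empty unit. The function name `tally_pairs` is likewise hypothetical.

```python
# Assumed two-character codes for the four interval-pair types.
LABELS = {
    "AB": "Tap-Buzz",
    "A-": "Tap-No Buzz",
    "-B": "No Tap-Buzz",
    "--": "No Tap-No Buzz",
}

def tally_pairs(time_line):
    """Split a time line into successive response-outcome pairs and count
    how often each of the four pair types occurs."""
    if len(time_line) % 2 != 0:
        raise ValueError("time line must contain an even number of time units")
    counts = {label: 0 for label in LABELS.values()}
    for i in range(0, len(time_line), 2):
        pair = time_line[i:i + 2]
        if pair not in LABELS:
            raise ValueError(f"unrecognized interval pair: {pair!r}")
        counts[LABELS[pair]] += 1
    return counts

# Five interval pairs: AB, A-, -B, --, AB.
example = "ABA--B--AB"
print(tally_pairs(example))
```

Because the pairs are read off at fixed two-unit boundaries, the A--B run inside the example is counted as one Tap-No Buzz followed by one No Tap-Buzz, which is the segmentation the instructions were meant to enforce.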

Results 

Means and standard deviations of subjects' judgments for the 24 problems
in each judgment condition are shown in Table 5. Figure 5 illustrates
subjects' judgments of the nine contingencies, p(B/T) - p(B/no T), and the eight
probabilities of buzzing sound, p(B), for the four judgment conditions. These
functions were simultaneously compared by analysis of variance, including sex
of subject and judgment condition as factors. Paired follow-up analyses were
conducted on interactions, setting alpha at .025 to reduce the experiment-wide
error rate.



Insert Table 5 and Figure 5 about here



The overall analysis yielded reliable linear, F(1, 152) = 351.36, p < .001,
quadratic, F(1, 152) = 100.92, p < .001, and cubic, F(1, 152) = 12.52, p < .001,
trends of response-outcome contingency on subjects' judgments. As in our
previous experiments, judgments were a function of problem contingency, but
with judgments of negative relations closer to zero than those of positive
relations. This analysis also showed a main effect of judgment condition,
F(3, 152) = 11.^0, p < .001, although that effect is qualified by a
contingency by condition interaction, F(23, 3496) = 2.47, p < .001. As seen in
the left portion of Figure 5, the form of this interaction shows that judgments
in the Table condition were most symmetrical about zero, judgments in the Line
condition were least symmetrical, and judgments in the Line-Interval and Line-
Table conditions fell between these two extremes. Follow-up analyses compared
contingency judgment functions for selected condition pairs. Line-Interval
and Line conditions were compared to identify the effect of the interval
segmenting instructions. This analysis showed Line-Interval subjects to be
significantly different from Line subjects: linear trend F(1, 76) = 11.12, p =
.001, the quadratic trend approaching significance, F(1, 76) = 4.92, p = .029.
Comparison of Line-Table and Line-Interval contingency functions showed that
tabling the frequency information had no additional effect on judgment accuracy.
Line-Table and Table judges were compared to see if judges who tabled the
frequency information for themselves were equivalent in judgment to those who
judged tables provided by the experimenter. This comparison showed that
contingency judgment functions were not equivalent for the two groups, with
Line-Table and Table judges reliably different in quadratic trend, F(1, 76) =
5.83, p = .018, but not in linear or cubic trends.

Sex differences in contingency functions were statistically significant,
with the contingency-judgment function for females flatter than that for
males: linear trend F(1, 152) = 3.9^, p = .049, cubic trend F(1, 152) = 4.38,
p = .038. This sex effect did not interact significantly with judgment condition.

As in our previous experiments, subjects' judgments were an increasing
function of the probability of the buzzing sound (see right portion of Figure
5). Ratings showed significant linear, F(1, 152) = 210.66, p < .001,
quadratic, F(1, 152) = 30.90, p < .001, and cubic, F(1, 152) = 4.53, p = .034,
trends as a function of p(B). Unlike previous analyses, however, these
probability-judgment functions were not reliably affected by judgment
condition, although the Line group again showed the greatest effect of p(B) and
the Table group showed the least effect. Effects of p(B) also did not differ as
a function of subjects' sex.

The relative contributions of response-outcome contingency and outcome
probability in each of the four conditions were determined as in the prior
experiments. For the Table group, p(B/T) - p(B/no T) accounted for 89.07% of the
total problem variance and p(B) accounted for 9.47%; for the Line-Table group,
the corresponding scores were 80.97% and 17.02%; for the Line-Interval group,
the scores were 76.04% and 17.61%; and for the Line group, the scores were
71.38% and 22.64%. In only the latter two groups was the residual variance
significant: Line-Interval residual = 6.35%, F(17, 646) = 6.72, p < .001, and
Line residual = 5.98%, F(17, 646) = 2.25, p = .003.
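
As an arithmetic check, each of the two significant residuals equals 100% minus the percentages attributed to contingency and to outcome probability. The sketch below only restates the reported figures; the variable and function names are illustrative.

```python
# Reported percentages of problem variance for the two groups with
# significant residuals: (contingency, outcome probability).
reported = {
    "Line-Interval": (76.04, 17.61),
    "Line": (71.38, 22.64),
}

def residual(contingency_pct, probability_pct):
    """Percentage of problem variance left after both factors."""
    return round(100.0 - contingency_pct - probability_pct, 2)

residuals = {group: residual(*pcts) for group, pcts in reported.items()}
print(residuals)
```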

Since frequency judgment errors may detract from contingency judgment
accuracy, the frequency tables generated by subjects in the Line-Table
condition were examined for accuracy. Overall, errors were small, with mean
absolute deviations of .15, .10, .30, and 1.65 for Tap-Buzz, Tap-No Buzz, No
Tap-Buzz, and No Tap-No Buzz frequencies, respectively. In view of the
differential judgments of positive and negative relationships in this condition,
frequency judgment accuracy was compared for problems representing positive
and negative contingencies. Absolute deviations were averaged across table
cells for this analysis. A matched-pairs t-test showed no reliable differences
in frequency judgment errors on positive and negative contingencies, t(39) < 1.
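
One way such accuracy scores could be computed is sketched below: for each table cell, the absolute difference between a subject's entered frequency and the true frequency is averaged over problems. The function name and the two example problems are hypothetical and are not the experimental data.

```python
CELLS = ("Tap-Buzz", "Tap-No Buzz", "No Tap-Buzz", "No Tap-No Buzz")

def mean_abs_deviation(reported_tables, true_tables):
    """Per-cell mean absolute deviation between reported and true
    frequencies, averaged over the supplied problems."""
    n = len(true_tables)
    return {
        cell: sum(abs(r[cell] - t[cell])
                  for r, t in zip(reported_tables, true_tables)) / n
        for cell in CELLS
    }

# Two hypothetical problems with true and subject-entered frequencies.
true_tables = [
    dict(zip(CELLS, (6, 6, 6, 6))),
    dict(zip(CELLS, (9, 3, 3, 9))),
]
reported_tables = [
    dict(zip(CELLS, (6, 6, 6, 5))),
    dict(zip(CELLS, (9, 3, 4, 7))),
]

deviations = mean_abs_deviation(reported_tables, true_tables)
print(deviations)
```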

Discussion 

Experiment 4 represents a conceptual replication of our third experiment.
In Experiment 3, we broke the time line into discrete units. In this
experiment, we taught the subjects themselves to define these intervals. The
results indicate that the manipulations in the two experiments had similar
effects. Line-Interval and Line-Table subjects in Experiment 4 produced
contingency-judgment functions intermediate to those of our Line and Table
subjects. Line-Interval and Line-Table subjects' contingency-judgment functions
were more symmetrical about zero than that of Line subjects, although the two
new conditions did not differ from each other. This failure to find additional
improvement by subjects who completed a frequency table indicates that the
availability of summary information contributes little to judgment accuracy.
However, the similarity of these two functions to that of subjects in our past
broken time line condition enhances our confidence in the problem of event
segmenting as a source of error in judging negative relationships.

The finding that Line-Table judges are also less accurate than Table
judges is a bit of a surprise. These subjects have effectively converted time
line information into a tabled format. However, the accuracy of that conversion
is a second question. Since any deviations in frequency judgments must
necessarily be in the direction of lower accuracy, subjects in this condition
may have somewhat erroneous information on which to base their judgments.
However, a look at subjects' frequency counts indicates reasonable accuracy;
indeed, 12 out of 40 subjects did not show a single error on any of the 24
problems. In addition, error rates were similar on negative and positive
contingency problems. Thus, inaccuracy of frequency judgments constitutes a
weak account of the difference in judgment functions of Line-Table and Table
subjects.

These differences between Line-Table and Table judgments replicate the
stimulus presentation effects of Ward and Jenkins (1965) in a substantially
different format. Their subjects viewed sequences of event-outcome pairs
(cloud seeding or no cloud seeding; rain or no rain), each sequence indicating
some degree of positive relationship. When the sequence was complete, one group
of subjects saw a table summarizing the frequencies of each of the event-state
combinations. A second group saw the tabled information only. Ward and
Jenkins found that subjects who saw the tabled information after the event
series were less accurate in their judgments than those who saw the tabled
information alone. It was this finding that inspired the experimenters to
conclude that viewing the event sequence had caused the subjects to represent
the information in a way that the table failed to counteract, perhaps
differentially emphasizing the relative importance of particular event-state
pairings. Our own results parallel these past findings closely. In our case,
however, subjects viewed event contingencies in a linear representation free
of memory demands.

As in our previous experiments, subjects' judgments here were biased by
the probability of the buzzing sound. However, unlike Experiments 2 and 3,
the extent of that bias was not reliably different in the Line and Table
judgment conditions. The failure to replicate this finding is surprising and
difficult to account for given the comparability of other aspects of the
present results to our other previous outcomes. This finding does temper our
confidence in the previous result that judgments of tabled information are
relatively free of the effect of the probability of outcome.

Finally, this experiment showed a reliable effect of sex, with
contingency-judgment functions of females reliably flatter than those of males.
This difference may indicate that females have a higher judgment error rate
than males, contributing to flatter functions. This interpretation is
congruent with findings in our related work (Shaklee & Hall, in press) showing
that females use simpler, less accurate rules than those used by males to
judge event covariations. An alternative interpretation of the sex differences
in the present experiment is that the two sexes judge the problems with similar
accuracy, but that the females use a more limited range of the scale to make
their judgments. However, a comparison of judgments indicates that the two
sexes use the scale extremes (±4) at comparable rates (11.3% and 12.2% of
judgments for males and females, respectively), ruling out response
conservatism as a viable account of this sex difference.

Concluding Comments 

In overview, the results of four different experiments suggest that
judgments of interevent contingency depend importantly on the method of
presenting information about event pairings. Most accurate were judgments of
summary table information (Experiments 2 and 4); least accurate were judgments
of information presented in a continuous time line format (Experiments 1, 2,
and 4). The accuracy of subjects judging partitioned time lines (Experiment
3) fell in between that of the other two conditions. Subjects trained to
segment continuous time lines (Experiment 4) made judgments similar to those
of subjects who saw partitioned time lines. This evidence suggests that Ward
and Jenkins (1965) were correct in their suspicion that presentation format may
influence subjects' treatment of frequency information in making contingency
judgments. Our evidence indicates that subjects may break event sequences into
different discrete event pairings depending on the format in which the
frequency information is presented. This explanation accounts well for our own
findings, but may not be similarly useful in explaining the effects of
relationship direction in some past paradigms. As noted earlier, slide or card
sequence presentations offer event pairings as discrete units rather than as
event continua.

This interpretation offers a ready account for the finding in past
research that subjects judge negative relationships less accurately than
positive relationships. Past researchers have suggested that subjects know
how to judge positive, but not negative, contingencies. Allan (1980), however,
pointed out one difficulty with this interpretation: subjects who only know
how to judge positive relationships must be able to distinguish between
positive and negative relationships in order to apply the appropriate rule to
positive contingencies. Presumably, a different, less accurate rule is
applied to negative (and independent) relationships. Thus, this
interpretation requires that an individual maintain more than one rule to judge
event contingencies, and that the person know when to apply which rule to which
relationship.

Our analysis indicates a single judgment problem which would result in
differential accuracy on positive and negative relationships: that is,
subjects' boundaries for event segments depend on the other events in the stream.
Positive relationships are typified by many response-outcome pairs which would
define a brief time interval as a response-outcome unit. However, where few
outcomes promptly follow responses, the observer may accept relatively delayed
outcomes as "caused" by the response. The estimate of response-outcome pairs
is inflated, resulting in an illusion of a relationship which is less negative
than is objectively the case.
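
The segmentation account above can be illustrated with a minimal sketch. It assumes, purely for illustration, that an observer treats a response as paired with an outcome whenever a buzz follows a tap within some acceptance window. The function name `count_pairs`, the window values, and the event times are hypothetical and are not taken from the experiments.

```python
from bisect import bisect_left


def count_pairs(taps, buzzes, window):
    """Count taps followed by at least one buzz within `window` seconds.

    Hypothetical pairing rule used only to illustrate the argument.
    """
    buzzes = sorted(buzzes)
    paired = 0
    for t in taps:
        i = bisect_left(buzzes, t)  # index of the first buzz at or after the tap
        if i < len(buzzes) and buzzes[i] - t <= window:
            paired += 1
    return paired


# Made-up event times in seconds (illustrative only, not experimental data).
taps = [0.0, 10.0, 20.0, 30.0]
buzzes = [0.5, 14.0, 26.0, 37.0]

strict = count_pairs(taps, buzzes, window=1.0)   # only prompt outcomes count
lenient = count_pairs(taps, buzzes, window=8.0)  # delayed outcomes are accepted

print(strict, lenient)
```

With these made-up times, the strict window yields 1 tap-buzz pair out of 4 taps and the lenient window yields 4, so an observer who tolerates delayed outcomes would arrive at a larger apparent p(B/T) from the same event stream.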

We would argue that the problems our subjects encountered in the time
line format could be similar to those encountered in judgments of real world
contingencies: response-outcome delays may vary in everyday experience. One
task of the perceiver is then to define which sequences represent true
response-outcome pairings. Investigations of the cues used to break event
sequences into discrete units are rare. Our evidence suggests that understanding
this process may be important to our ability to account for contingency
judgments.

Table 1

Frequencies of Response-Outcome Possibilities
in Each Experimental Problem

Problem   Tap-Buzz   Tap-No Buzz   No Tap-Buzz   No Tap-No Buzz

    1        12           0             0              12
    2         9           3             0              12
    3         6           6             0              12
    4         3           9             0              12
    5        12           0             3               9
    6         9           3             3               9
    7         6           6             3               9
    8         3           9             3               9
    9         0          12             3               9
   10        12           0             6               6
   11         9           3             6               6
   12         6           6             6               6
   13         3           9             6               6
   14         0          12             6               6
   15        12           0             9               3
   16         9           3             9               3
   17         6           6             9               3
   18         3           9             9               3
   19         0          12             9               3
   20        12           0            12               0
   21         9           3            12               0
   22         6           6            12               0
   23         3           9            12               0
   24         0          12            12               0
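
As a minimal sketch of how the two coordinates used in the rating tables relate to the frequencies in Table 1, the snippet below computes the contingency p(B/T) - p(B/T̄) and the outcome probability p(B) from the four cell counts. The function name `contingency_and_outcome_probability` is a hypothetical helper; the three example rows are Problems 1, 2, and 24 as listed in Table 1.

```python
def contingency_and_outcome_probability(tap_buzz, tap_no_buzz,
                                        no_tap_buzz, no_tap_no_buzz):
    """Return (p(B/T) - p(B/not T), p(B)) for one problem's frequencies."""
    p_b_given_tap = tap_buzz / (tap_buzz + tap_no_buzz)
    p_b_given_no_tap = no_tap_buzz / (no_tap_buzz + no_tap_no_buzz)
    total = tap_buzz + tap_no_buzz + no_tap_buzz + no_tap_no_buzz
    p_b = (tap_buzz + no_tap_buzz) / total
    return p_b_given_tap - p_b_given_no_tap, p_b


# Problems 1, 2, and 24 from Table 1:
# (Tap-Buzz, Tap-No Buzz, No Tap-Buzz, No Tap-No Buzz)
for problem, cells in {1: (12, 0, 0, 12),
                       2: (9, 3, 0, 12),
                       24: (0, 12, 12, 0)}.items():
    delta_p, p_b = contingency_and_outcome_probability(*cells)
    print(problem, delta_p, p_b)
```

Under this reading, Problem 1 is a perfect positive contingency (+1.00) and Problem 24 a perfect negative one (-1.00), both with p(B) = .500, while Problem 2 has a contingency of +0.75 with p(B) = .375.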

Table 2

Means and Standard Deviations (in Parentheses) of Subjects' Ratings
in the Between- and Within-Subjects Parts of Experiment 1

                                      p(B/T) - p(B/T̄)

p(B)       -1.00    -0.75    -0.50    -0.25     0.00    +0.25    +0.50    +0.75    +1.00

Between Subjects

.125                                  -1.57              0.13
                                     (1.33)            (0.90)
.250                         -0.91              0.09              1.30
                            (1.59)            (1.54)            (1.57)
.375                -1.04             -0.74              0.17              1.?1
                   (2.07)            (1.48)            (1.79)            (1.52)
.500       -1.43              0.00             -0.??              0.96              2.30
          (2.10)            (1.87)            (2.05)            (1.49)            (2.37)
.625                -0.39             -0.52              0.39              1.78
                   (1.??)            (2.00)            (1.69)            (1.69)
.750                         -0.30              0.00              1.63
                            (1.97)            (1.14)            (1.44)
.875                                  -0.52              ?.??
                                     (2.02)            (2.02)
1.000                                           0.09
                                              (0.88)

Within Subjects

.125                                  -1.48             -0.52
                                     (1.36)            (1.65)
.250                         -0.60             -0.60              0.88
                            (1.94)            (1.72)            (1.63)
.375                -0.92             -0.48              0.40              1.96
                   (1.85)            (1.10)            (0.94)            (1.31)
.500       -1.16              0.00              0.08              1.52              3.48
          (1.78)            (1.36)            (1.49)            (1.43)            (1.42)
.625                 0.20              0.12              1.28              2.24
                   (1.39)            (1.27)            (1.22)            (1.24)
.750                          0.44              0.60              2.12
                            (1.39)            (1.20)            (1.11)
.875                                   1.28              1.48
                                     (1.46)            (1.58)
1.000                                           0.92
                                              (1.90)



Table 3

Means and Standard Deviations (in Parentheses) of Subjects' Ratings
Under the Time Line and Summary Table Formats of Experiment 2

                                      p(B/T) - p(B/T̄)

p(B)       -1.00    -0.75    -0.50    -0.25     0.00    +0.25    +0.50    +0.75    +1.00

Time Line

.125                                  -2.38             -2.09
                                     (2.06)            (2.13)
.250                         -1.09             -1.15              0.56
                            (2.65)            (2.20)            (2.19)
.375                -1.32             -0.62              0.94              1.41
                   (1.81)            (1.78)            (1.24)            (1.97)
.500       -0.94             -0.26             -0.06              1.29              2.47
          (2.11)            (1.38)            (1.75)            (1.74)            (2.29)
.625                 0.62              0.32              1.29              1.85
                   (1.85)            (1.34)            (1.15)            (1.80)
.750                          0.71              0.85              1.8?
                            (1.72)            (1.54)            (1.77)
.875                                   1.76              1.62
                                     (?.04)            (2.00)
1.000                                           0.79
                                              (2.26)

Summary Table

.125                                  -1.41             -0.21
                                     (2.18)            (2.18)
.250                         -1.09             -0.38              0.74
                            (2.72)            (1.91)            (2.36)
.375                -1.03             -1.03              0.?9              1.26
                   (2.55)            (2.02)            (1.49)            (1.82)
.500       -1.44             -0.74              0.2?              1.15              2.4?
          (2.17)            (1.87)            (1.06)            (1.65)            (1.87)
.625                -1.68             -0.06              1.03              1.24
                   (1.74)            (1.55)            (1.54)            (2.38)
.750                         -0.29              0.50              1.62
                            (2.01)            (1.50)            (1.97)
.875                                   0.??              0.91
                                     (2.26)            (2.12)
1.000                                           0.50
                                              (1.74)



Table 4

Means and Standard Deviations (in Parentheses) of Subjects' Ratings
Under the Broken and Unbroken Time Line Conditions of Experiment 3

                                      p(B/T) - p(B/T̄)

p(B)       -1.00    -0.75    -0.50    -0.25     0.00    +0.25    +0.50    +0.75    +1.00

Broken Time Line

.125                                  -1.64             -0.?8
                                     (1.98)            (1.98)
.250                         -1.36             -0.28              0.96
                            (1.62)            (1.37)            (1.56)
.375                -0.96              0.36              0.56              2.24
                   (2.09)            (1.16)            (1.17)            (1.77)
.500       -2.12              0.16              0.36              1.60              3.80
          (2.63)            (0.83)            (1.32)            (1.36)            (0.98)
.625                -0.60              0.52              1.68              2.08
                   (1.96)            (1.02)            (1.26)            (1.57)
.750                          0.12              1.32              1.92
                            (1.39)            (1.3?)            (1.??)
.875                                   0.11              1.52
                                     (1.41)            (1.47)
1.000                                           0.24
                                              (0.86)

Unbroken Time Line

.125                                  -1.48             -0.52
                                     (1.36)            (1.65)
.250                         -0.60             -0.60              0.88
                            (1.94)            (1.72)            (1.63)
.375                -0.92             -0.48              0.40              1.96
                   (1.85)            (1.10)            (0.94)            (1.31)
.500       -1.16              0.00              0.08              1.52              3.48
          (1.78)            (1.36)            (1.49)            (1.45)            (1.42)
.625                 0.20              0.12              1.23              2.24
                   (1.39)            (1.27)            (1.22)            (1.24)
.750                          0.44              0.60              2.12
                            (1.39)            (1.20)            (1.11)
.875                                   1.28              1.48
                                     (1.46)            (1.58)
1.000                                           0.92
                                              (1.90)

Table 5

Means and Standard Deviations (in Parentheses) of Subjects' Ratings
in the Four Conditions of Experiment 4

                                      p(B/T) - p(B/T̄)

p(B)       -1.00    -0.75    -0.50    -0.25     0.00    +0.25    +0.50    +0.75    +1.00

Line

.125                                  -1.93             -0.78
                                     (1.97)            (2.27)
.250                         -0.78             -0.55              1.15
                            (1.84)            (1.72)            (1.77)
.375                -0.98             -0.45              0.70              2.13
                   (1.93)            (1.73)            (1.99)            (1.60)
.500       -1.28             -0.25              0.05              1.58              3.45
          (1.95)            (1.32)            (1.53)            (1.53)            (1.16)
.625                 0.45              0.25              1.23              2.25
                   (1.84)            (1.32)            (1.33)            (1.32)
.750                          0.55              0.55              1.60
                            (1.99)            (1.72)            (1.77)
.875                                   0.60              1.83
                                     (2.30)            (1.66)
1.000                                           0.68
                                              (2.08)

Line-Interval

.125                                  -2.33             -0.33
                                     (1.52)            (2.04)
.250                         -2.10             -0.58              1.48
                            (1.69)            (1.46)            (1.38)
.375                -1.80             -0.60              1.28              2.60
                   (1.93)            (1.26)            (0.89)            (1.02)
.500       -2.55             -0.80              0.63              1.70              3.80
          (1.48)            (1.31)            (1.07)            (0.95)            (0.64)
.625                -0.??              0.23              1.15              2.70
                   (1.73)            (1.15)            (1.11)            (0.81)
.750                          0.63              0.63              2.13
                            (1.20)            (1.09)            (1.05)
.875                                   0.80              2.08
                                     (1.68)            (1.49)
1.000                                           0.20
                                              (1.42)

Line-Table

.125                                  -1.90             -0.70
                                     (1.39)            (1.91)
.250                         -1.60             -0.63              0.58
                            (2.31)            (1.43)            (1.50)
.375                -2.48             -1.30              0.50              2.20
                   (1.60)            (1.27)            (1.10)            (1.49)
.500       -2.28             -0.88              0.20              1.63              3.68
          (1.79)            (1.35)            (0.90)            (1.70)            (0.98)
.625                -0.73              0.08              1.23              2.70
                   (1.57)            (1.23)            (1.21)            (1.?8)
.750                         -0.05              0.43              2.08
                            (1.52)            (1.28)            (1.44)
.875                                   0.33              1.68
                                     (1.54)            (1.47)
1.000                                           0.20
                                              (1.?0)

Table

.125                                  -2.03             -0.25
                                     (1.42)            (1.?6)
.250                         -1.90             -0.38              0.68
                            (1.76)            (1.35)            (1.79)
.375                -2.20             -1.20              0.53              1.93
                   (1.44)            (1.31)            (1.40)            (1.99)
.500       -3.00             -1.73             -0.03              1.63              3.10
          (1.67)            (1.28)            (0.?7)            (1.35)            (1.77)
.625                -1.83             -0.70              1.20              2.78
                   (1.72)            (1.10)            (1.03)            (0.88)
.750                         -0.53              0.58              1.98
                            (1.76)            (0.92)            (1.35)
.875                                  -0.43              1.58
                                     (1.56)            (1.20)
1.000                                           0.35
                                              (1.26)
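
One rough way to compare the steepness of contingency-judgment functions is a least-squares slope of mean rating against contingency. The sketch below applies this to the p(B) = .500 rows of the Line and Table conditions as read from Table 5. It is an illustration only: the helper `ls_slope` is hypothetical, and the functions plotted in the figures may have been computed differently (for example, by averaging over all values of p(B)).

```python
def ls_slope(xs, ys):
    """Ordinary least-squares slope of ys regressed on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


# Contingencies available at p(B) = .500, with mean ratings read from Table 5.
contingencies = [-1.0, -0.5, 0.0, 0.5, 1.0]
line_means = [-1.28, -0.25, 0.05, 1.58, 3.45]    # Line condition
table_means = [-3.00, -1.73, -0.03, 1.63, 3.10]  # Table condition

line_slope = ls_slope(contingencies, line_means)
table_slope = ls_slope(contingencies, table_means)

print(round(line_slope, 3), round(table_slope, 3))
```

On these rows alone the slope is about 2.26 rating units per unit of contingency for the Line condition and about 3.11 for the Table condition, that is, a steeper function for summary table judgments.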

Figure Captions

Figure 1. The 24 different response-outcome problems on the coordinates
p(B/T) and p(B/T̄). The top portion locates the nine different
response-outcome contingencies, p(B/T) - p(B/T̄), on the unit square; the bottom
portion locates the eight different outcome probabilities, p(B). See text for
additional explanation.

Figure 2. Contingency-judgment functions (left) and probability-judgment
functions (right) in the within- and between-subjects parts of Experiment 1.

Figure 3. Contingency-judgment functions (left) and probability-judgment
functions (right) under the time line and summary table formats of Experiment 2.

Figure 4. Contingency-judgment functions (left) and probability-judgment
functions (right) under the broken and unbroken time line conditions of
Experiment 3.

Figure 5. Contingency-judgment functions (left) and probability-judgment
functions (right) under the four experimental conditions of Experiment 4.